Publication


Featured research published by David Simoncini.


Proceedings of the National Academy of Sciences of the United States of America | 2014

Computational design of a self-assembling symmetrical β-propeller protein

Arnout Voet; Hiroki Noguchi; Christine Addy; David Simoncini; Daiki Terada; Satoru Unzai; Sam-Yong Park; Kam Y. J. Zhang; Jeremy R. H. Tame

Significance: In this study, we have designed and experimentally validated, to our knowledge, the first perfectly symmetrical β-propeller protein. Our results provide insight not only into protein evolution through duplication events, but also into methods for creating designer proteins that self-assemble according to simple arithmetical rules. Such proteins may have very wide uses in bionanotechnology. Furthermore, our design approach is both rapid and applicable to many different protein templates. Our novel propeller protein consists of six identical domains known as “blades.” Using a variety of biophysical techniques, we show it to be highly stable and report several high-resolution crystal structures of different forms of the protein. Domain swapping allows us to generate related oligomeric forms with fixed numbers of blades per complex.

Abstract: The modular structure of many protein families, such as β-propeller proteins, strongly implies that duplication played an important role in their evolution, leading to highly symmetrical intermediate forms. Previous attempts to create perfectly symmetrical propeller proteins have failed, however. We have therefore developed a new and rapid computational approach to design such proteins. As a test case, we have created a sixfold symmetrical β-propeller protein and experimentally validated the structure using X-ray crystallography. Each blade consists of 42 residues. Proteins carrying 2–10 identical blades were also expressed and purified. Two or three tandem blades assemble to recreate the highly stable sixfold symmetrical architecture, consistent with the duplication and fusion theory. The other proteins produce different monodisperse complexes, up to 42 blades (180 kDa) in size, which self-assemble according to simple symmetry rules. Our procedure is suitable for creating nano-building blocks from different protein templates of desired symmetry.
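The abstract does not spell out the "simple arithmetical rules". As an illustration only, the sketch below assumes one plausible rule: chains carrying n blades combine until the total blade count is a whole number of six-bladed rings, i.e. the least common multiple of n and 6. Under that assumption, two- and three-blade chains give a single six-blade ring and the largest complex for n between 2 and 10 has 42 blades, which matches the figures quoted above. The function name and the rule itself are assumptions, not taken from the paper.

```python
from math import lcm  # available in Python 3.9+

BLADES_PER_RING = 6  # the designed propeller has six blades


def smallest_complex(blades_per_chain: int) -> tuple[int, int]:
    """Return (chains, total_blades) for the smallest assembly of whole rings.

    Assumed rule (not stated in the abstract): the total number of blades
    is lcm(blades_per_chain, 6).
    """
    total = lcm(blades_per_chain, BLADES_PER_RING)
    return total // blades_per_chain, total


for n in range(2, 11):
    chains, total = smallest_complex(n)
    print(f"{n} blades per chain: {chains} chains, {total} blades in total")
```

The observed oligomeric states should be taken from the paper itself; this only shows how such a rule could be evaluated.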


PLOS ONE | 2012

A Probabilistic Fragment-Based Protein Structure Prediction Algorithm

David Simoncini; Francois Berenger; Rojan Shrestha; Kam Y. J. Zhang

Conformational sampling is one of the bottlenecks in fragment-based protein structure prediction approaches. They generally start with a coarse-grained optimization where mainchain atoms and centroids of side chains are considered, followed by a fine-grained optimization with an all-atom representation of proteins. It is during this coarse-grained phase that fragment-based methods intensely sample the conformational space. If the native-like region is sampled more, the accuracy of the final all-atom predictions may be improved accordingly. In this work we present EdaFold, a new method for fragment-based protein structure prediction based on an Estimation of Distribution Algorithm. Fragment-based approaches build protein models by assembling short fragments from known protein structures. Whereas the probability mass functions over the fragment libraries are uniform in the usual case, we propose an algorithm that learns from previously generated decoys and steers the search toward native-like regions. A comparison with the Rosetta AbInitio protocol shows that EdaFold is able to generate models with lower energies and to enhance the percentage of near-native coarse-grained decoys on a benchmark of proteins. The best coarse-grained models produced by both methods were refined into all-atom models and used in molecular replacement. All-atom decoys produced from EdaFold’s decoy set reach high enough accuracy to solve the crystallographic phase problem by molecular replacement for some test proteins. EdaFold showed a higher success rate in molecular replacement than Rosetta. Our study suggests that improving low-resolution coarse-grained decoys allows computational methods to avoid subsequent sampling issues during all-atom refinement and to produce better all-atom models. EdaFold can be downloaded from http://www.riken.jp/zhangiru/software/.
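The idea of learning fragment probabilities from earlier decoys can be sketched as a generic estimation of distribution loop. This is a minimal illustration, not EdaFold's actual update rule: the `keep` fraction, the `learning_rate` blend and the toy energy are all hypothetical choices made for the example.

```python
import random
from collections import Counter


def update_pmfs(pmfs, decoys, energies, keep=0.2, learning_rate=0.5):
    """Shift each position's fragment PMF toward fragments used by low-energy decoys.

    keep and learning_rate are hypothetical parameters for this sketch.
    """
    n_keep = max(1, int(len(decoys) * keep))
    ranked = sorted(range(len(decoys)), key=lambda i: energies[i])
    elite = [decoys[i] for i in ranked[:n_keep]]
    new_pmfs = []
    for pos, pmf in enumerate(pmfs):
        counts = Counter(decoy[pos] for decoy in elite)
        # Blend the old PMF with the fragment frequencies observed in the elite set.
        new_pmfs.append([
            (1 - learning_rate) * p + learning_rate * counts[frag] / n_keep
            for frag, p in enumerate(pmf)
        ])
    return new_pmfs


def sample_decoy(pmfs, rng):
    """Pick one fragment index per position according to the current PMFs."""
    return [rng.choices(range(len(pmf)), weights=pmf)[0] for pmf in pmfs]


def toy_energy(decoy):
    # Hypothetical stand-in for an energy function: fragment 0 is best everywhere.
    return sum(decoy)


rng = random.Random(0)
n_positions, n_fragments = 5, 4
# Start from uniform PMFs, as in the usual fragment-assembly case.
pmfs = [[1 / n_fragments] * n_fragments for _ in range(n_positions)]
for _ in range(10):
    decoys = [sample_decoy(pmfs, rng) for _ in range(50)]
    energies = [toy_energy(d) for d in decoys]
    pmfs = update_pmfs(pmfs, decoys, energies)
```

Blending with the previous PMF rather than replacing it keeps every fragment's probability above zero, so the search can still revisit fragments that were unlucky in early rounds.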


Journal of Chemical Theory and Computation | 2015

Guaranteed Discrete Energy Optimization on Large Protein Design Problems

David Simoncini; David Allouche; Simon de Givry; Céline Delmas; Sophie Barbe; Thomas Schiex

In Computational Protein Design (CPD), assuming a rigid backbone and amino-acid rotamer library, the problem of finding a sequence with an optimal conformation is NP-hard. In this paper, using Dunbrack's rotamer library and the Talaris2014 decomposable energy function, we use an exact deterministic method combining branch and bound, arc consistency, and tree-decomposition to provably identify the global minimum energy sequence-conformation on full-redesign problems, defining search spaces of size up to 10^234. This is achieved on a single core of a standard computing server, requiring a maximum of 66 GB RAM. A variant of the algorithm is able to exhaustively enumerate all sequence-conformations within an energy threshold of the optimum. These proven optimal solutions are then used to evaluate the frequencies and amplitudes, in energy and sequence, at which an existing CPD-dedicated simulated annealing implementation may miss the optimum on these full redesign problems. The probability of finding an optimum drops close to 0 very quickly. In the worst case, despite 1,000 repeats, the annealing algorithm remained more than 1 Rosetta unit away from the optimum, leading to design sequences that could differ from the optimal sequence by more than 30% of their amino acids.
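To make the branch-and-bound ingredient concrete, here is a minimal sketch on a toy pairwise-decomposable energy. It is not the authors' solver and omits arc consistency and tree decomposition; the data layout (`unary`, `pair`) and the simple lower bound are assumptions chosen for the example. The result is checked against brute-force enumeration, which is only feasible because the toy instance is tiny.

```python
import itertools
import random


def energy(assign, unary, pair):
    """Total energy: one unary term per position plus one term per position pair."""
    n = len(assign)
    e = sum(unary[i][assign[i]] for i in range(n))
    e += sum(pair[(i, j)][assign[i]][assign[j]]
             for i in range(n) for j in range(i + 1, n))
    return e


def branch_and_bound(unary, pair):
    n = len(unary)
    # Optimistic contribution of position j: its best unary term plus the
    # smallest entry of every pair table (i, j) with i < j. This never
    # overestimates, so pruning on it cannot discard the optimum.
    optimistic = [
        min(unary[j]) + sum(min(min(row) for row in pair[(i, j)]) for i in range(j))
        for j in range(n)
    ]
    suffix = [0.0] * (n + 1)
    for j in range(n - 1, -1, -1):
        suffix[j] = suffix[j + 1] + optimistic[j]

    best = {"energy": float("inf"), "assign": None}

    def visit(assign, partial):
        depth = len(assign)
        if partial + suffix[depth] >= best["energy"]:
            return  # the lower bound cannot beat the incumbent: prune
        if depth == n:
            best["energy"], best["assign"] = partial, tuple(assign)
            return
        for a in range(len(unary[depth])):
            delta = unary[depth][a] + sum(
                pair[(i, depth)][assign[i]][a] for i in range(depth))
            visit(assign + [a], partial + delta)

    visit([], 0.0)
    return best["energy"], best["assign"]


# Hypothetical toy instance: 6 positions, 3 choices each, integer energies.
rng = random.Random(1)
n, d = 6, 3
unary = [[rng.randint(0, 9) for _ in range(d)] for _ in range(n)]
pair = {(i, j): [[rng.randint(0, 9) for _ in range(d)] for _ in range(d)]
        for i in range(n) for j in range(i + 1, n)}

best_energy, best_assign = branch_and_bound(unary, pair)
brute_force = min(energy(a, unary, pair)
                  for a in itertools.product(range(d), repeat=n))
```

Unlike simulated annealing, a search of this kind returns a solution together with a proof of optimality, because every discarded branch was excluded by a valid lower bound.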


Journal of Computational Chemistry | 2012

Durandal: fast exact clustering of protein decoys.

Francois Berenger; Rojan Shrestha; Yong Zhou; David Simoncini; Kam Y. J. Zhang

In protein folding, clustering is commonly used as one way to identify the best decoy produced. Initializing the pairwise distance matrix for a large decoy set is computationally expensive. We have proposed a fast method that works even on large decoy sets. This method is implemented in a software package called Durandal. Durandal has been shown to be consistently faster than other software performing fast exact clustering. In some cases, Durandal can even outperform the speed of an approximate method. Durandal uses the triangle inequality to accelerate exact clustering, without compromising the distance function. Recently, we have further enhanced the performance of Durandal by incorporating a quaternion‐based characteristic polynomial method that has increased the speed of Durandal by between 13% and 27% compared with the previous version. Durandal source code is available under the GNU General Public License at http://www.riken.jp/zhangiru/software/durandal_released_qcp.tgz. Alternatively, a compiled version of Durandal is also distributed with the nightly builds of the Phenix (http://www.phenix‐online.org/) crystallographic software suite (Adams et al., Acta Crystallogr Sect D 2010, 66, 213).
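The role of the triangle inequality can be illustrated with a small sketch. It is not Durandal's implementation: it uses a single reference item and plain Euclidean points as a hypothetical stand-in for decoys and their RMSD. For any metric, |d(a,r) - d(b,r)| <= d(a,b) <= d(a,r) + d(b,r), so many "within cutoff?" questions can be answered from distances to the reference alone, and the answers remain exact.

```python
import math
import random


def neighbor_counts(points, cutoff, dist=math.dist):
    """Count neighbours within `cutoff` for each point, skipping distance
    evaluations whenever triangle-inequality bounds already decide the answer."""
    n = len(points)
    ref = points[0]  # single reference item; an assumption of this sketch
    to_ref = [dist(p, ref) for p in points]
    counts = [0] * n
    evaluated = 0
    for i in range(n):
        for j in range(i + 1, n):
            lower = abs(to_ref[i] - to_ref[j])
            upper = to_ref[i] + to_ref[j]
            if lower > cutoff:
                close = False          # certainly too far apart
            elif upper <= cutoff:
                close = True           # certainly within the cutoff
            else:
                evaluated += 1         # bounds are inconclusive: measure
                close = dist(points[i], points[j]) <= cutoff
            if close:
                counts[i] += 1
                counts[j] += 1
    return counts, evaluated


rng = random.Random(42)
points = [(rng.uniform(0, 10), rng.uniform(0, 10), rng.uniform(0, 10))
          for _ in range(60)]
counts, evaluated = neighbor_counts(points, cutoff=2.0)
total_pairs = len(points) * (len(points) - 1) // 2
print(f"evaluated {evaluated} of {total_pairs} pairwise distances")
```

The saving matters most when each distance is expensive, as with a superposition-based RMSD between decoys.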


PLOS ONE | 2013

Efficient sampling in fragment-based protein structure prediction using an estimation of distribution algorithm.

David Simoncini; Kam Y. J. Zhang

Fragment assembly is a powerful method of protein structure prediction that builds protein models from a pool of candidate fragments taken from known structures. Stochastic sampling is subsequently used to refine the models. The structures are first represented as coarse-grained models and then as all-atom models for computational efficiency. Many models have to be generated independently due to the stochastic nature of the sampling methods used to search for the global minimum in a complex energy landscape. In this paper we present a fragment-based approach that shares information between the generated models and steers the search towards native-like regions. A distribution over fragments is estimated from a pool of low-energy all-atom models. This iteratively refined distribution is used to guide the selection of fragments during the building of models for subsequent rounds of structure prediction. The use of an estimation of distribution algorithm made it possible to reach lower energy levels and to generate a higher percentage of near-native models. The method uses an all-atom energy function and produces models with atomic resolution. We observed an improvement in energy-driven blind selection of models on a benchmark in comparison with the AbInitioRelax protocol.


Acta Crystallographica Section D: Biological Crystallography | 2012

Error-estimation-guided rebuilding of de novo models increases the success rate of ab initio phasing.

Rojan Shrestha; David Simoncini; Kam Y. J. Zhang

Recent advancements in computational methods for protein-structure prediction have made it possible to generate the high-quality de novo models required for ab initio phasing of crystallographic diffraction data using molecular replacement. Despite those encouraging achievements in ab initio phasing using de novo models, its success is limited only to those targets for which high-quality de novo models can be generated. In order to increase the scope of targets to which ab initio phasing with de novo models can be successfully applied, it is necessary to reduce the errors in the de novo models that are used as templates for molecular replacement. Here, an approach is introduced that can identify and rebuild the residues with larger errors, which subsequently reduces the overall Cα root-mean-square deviation (CA-RMSD) from the native protein structure. The error in a predicted model is estimated from the average pairwise geometric distance per residue computed among selected lowest energy coarse-grained models. This score is subsequently employed to guide a rebuilding process that focuses on more error-prone residues in the coarse-grained models. This rebuilding methodology has been tested on ten protein targets that were unsuccessful using previous methods. The average CA-RMSD of the coarse-grained models was improved from 4.93 to 4.06 Å. For those models with CA-RMSD less than 3.0 Å, the average CA-RMSD was improved from 3.38 to 2.60 Å. These rebuilt coarse-grained models were then converted into all-atom models and refined to produce improved de novo models for molecular replacement. Seven diffraction data sets were successfully phased using rebuilt de novo models, indicating the improved quality of these rebuilt de novo models and the effectiveness of the rebuilding process. Software implementing this method, called MORPHEUS, can be downloaded from http://www.riken.jp/zhangiru/software.html.
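The per-residue error estimate described above can be sketched as follows. This is an illustration, not the MORPHEUS code: it assumes the selected models are already superposed in a common frame, represents each model as a list of Cα coordinates, and uses a hypothetical `fraction` parameter to pick the residues to rebuild.

```python
import math
from itertools import combinations


def per_residue_error(models):
    """Average pairwise C-alpha distance per residue across the given models.

    models: list of models, each a list of (x, y, z) tuples of equal length,
    assumed to be superposed already (an assumption of this sketch).
    """
    n_res = len(models[0])
    pairs = list(combinations(range(len(models)), 2))
    return [
        sum(math.dist(models[a][r], models[b][r]) for a, b in pairs) / len(pairs)
        for r in range(n_res)
    ]


def residues_to_rebuild(errors, fraction=0.25):
    """Indices of the most error-prone residues; `fraction` is hypothetical."""
    n = max(1, round(len(errors) * fraction))
    ranked = sorted(range(len(errors)), key=lambda r: errors[r], reverse=True)
    return sorted(ranked[:n])


# Three toy models that agree everywhere except at residue index 2.
base = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
model_b = base[:2] + [(2.0, 3.0, 0.0)] + base[3:]
model_c = base[:2] + [(2.0, 0.0, 4.0)] + base[3:]

errors = per_residue_error([base, model_b, model_c])
targets = residues_to_rebuild(errors)
```

Residues where the low-energy models disagree most are flagged, on the reasoning that disagreement among good-scoring models signals a poorly determined region.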


Methods in Molecular Biology | 2017

Evolution-Inspired Computational Design of Symmetric Proteins

Arnout Voet; David Simoncini; Jeremy R. H. Tame; Kam Y. J. Zhang

Monomeric proteins with a number of identical repeats creating symmetrical structures are potentially very valuable building blocks with a variety of bionanotechnological applications. As such proteins do not occur naturally, the emerging field of computational protein design serves as an excellent tool to create them from nonsymmetrical templates. Existing pseudo-symmetrical proteins are believed to have evolved from oligomeric precursors by duplication and fusion of identical repeats. Here we describe a computational workflow to reverse-engineer this evolutionary process in order to create stable proteins consisting of identical sequence repeats.


Molecular Informatics | 2015

Quality Assessment of Predicted Protein Models Using Energies Calculated by the Fragment Molecular Orbital Method

David Simoncini; Hiroya Nakata; Koji Ogata; Shinichiro Nakamura; Kam Y. J. Zhang

Protein structure prediction directly from sequences is a very challenging problem in computational biology. One of the most successful approaches employs stochastic conformational sampling to search an empirically derived energy function landscape for the global energy minimum state. Due to the errors in the empirically derived energy function, the lowest energy conformation may not be the best model. We have evaluated the use of energy calculated by the fragment molecular orbital method (FMO energy) to assess the quality of predicted models and its ability to identify the best model among an ensemble of predicted models. The fragment molecular orbital method implemented in GAMESS was used to calculate the FMO energy of predicted models. When tested on eight protein targets, we found that the model ranking based on FMO energies is better than that based on empirically derived energies when there is sufficient diversity among these models. This model diversity can be estimated prior to the FMO energy calculations. Our result demonstrates that the FMO energy calculated by the fragment molecular orbital method is a practical and promising measure for the assessment of protein model quality and the selection of the best protein model among many generated.


Modelling, Computation and Optimization in Information Systems and Management Sciences | 2015

Approximate Counting with Deterministic Guarantees for Affinity Computation

Clément Viricel; David Simoncini; David Allouche; Simon de Givry; Sophie Barbe; Thomas Schiex

Computational Protein Design aims at rationally designing amino-acid sequences that fold into a given three-dimensional structure and that will bestow the designed protein with desirable properties or functions. Usual criteria for design include stability of the designed protein and affinity between it and a ligand of interest. However, estimating the affinity between two molecules requires computing the partition function, a #P-complete problem.
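For reference, the partition function of a set of conformations with energies E is Z = sum of exp(-E / kT), and a common affinity estimate is the ratio of the partition function of the complex to the product of those of the unbound partners. The sketch below computes this by explicit enumeration over toy energy lists, which is exactly what stops scaling on real problems. The energy values are invented for the example, and kT of about 0.593 kcal/mol is the usual room-temperature value; neither comes from the abstract.

```python
import math


def log_partition(energies, kT=0.593):
    """Return ln Z = ln sum(exp(-E / kT)), using a log-sum-exp shift for stability.

    kT defaults to roughly 0.593 kcal/mol (near 298 K).
    """
    lowest = min(energies)
    return -lowest / kT + math.log(
        sum(math.exp(-(e - lowest) / kT) for e in energies))


def log_affinity(complex_energies, protein_energies, ligand_energies, kT=0.593):
    """ln of Z_complex / (Z_protein * Z_ligand), a ratio-of-partition-functions score."""
    return (log_partition(complex_energies, kT)
            - log_partition(protein_energies, kT)
            - log_partition(ligand_energies, kT))


# Hypothetical conformation energies in kcal/mol, for illustration only.
score = log_affinity(
    complex_energies=[-12.0, -11.5, -10.0],
    protein_energies=[-5.0, -4.8],
    ligand_energies=[-3.0, -2.5, -2.0],
)
```

With thousands of rotamers per position the number of conformations grows exponentially, which is why the paper's title refers to approximate counting with deterministic guarantees instead of enumeration.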


Proteins | 2017

Balancing exploration and exploitation in population-based sampling improves fragment-based de novo protein structure prediction: Efficient Conformational Sampling Strategies

David Simoncini; Thomas Schiex; Kam Y. J. Zhang

Conformational search space exploration remains a major bottleneck for protein structure prediction methods. Population‐based meta‐heuristics typically make it possible to control the search dynamics and to tune the balance between local energy minimization and search space exploration. EdaFold is a fragment‐based approach that can guide search by periodically updating the probability distribution over the fragment libraries used during model assembly. We implement the EdaFold algorithm as a Rosetta protocol and provide two different probability update policies: a cluster‐based variation (EdaRose_c) and an energy‐based one (EdaRose_en). We analyze the search dynamics of our new Rosetta protocols and show that EdaRose_c is able to provide predictions with lower Cα RMSD to the native structure than EdaRose_en and the Rosetta AbInitio Relax protocol. Our software is freely available as a C++ patch for the Rosetta suite and can be downloaded from http://www.riken.jp/zhangiru/software/. Our protocols can easily be extended in order to create alternative probability update policies and generate new search dynamics. Proteins 2017; 85:852–858.

Collaboration

Top Co-Authors

Kam Y. J. Zhang
Fred Hutchinson Cancer Research Center

Thomas Schiex
Institut national de la recherche agronomique

Arnout Voet
Katholieke Universiteit Leuven

David Allouche
Institut national de la recherche agronomique

Simon de Givry
Institut national de la recherche agronomique

Clément Viricel
Institut national de la recherche agronomique