Publication


Featured research published by Marco Vassura.


IEEE/ACM Transactions on Computational Biology and Bioinformatics | 2008

Reconstruction of 3D Structures From Protein Contact Maps

Marco Vassura; Luciano Margara; Pietro Di Lena; Filippo Medri; Piero Fariselli; Rita Casadio

The prediction of the protein tertiary structure solely from its residue sequence (the so-called Protein Folding Problem) is one of the most challenging problems in Structural Bioinformatics. We focus on the protein residue contact map. When this map is assigned, it is possible to reconstruct the 3D structure of the protein backbone. The general problem of recovering a set of 3D coordinates consistent with a given contact map is known as a unit-disk-graph realization problem, and it has recently been proven to be NP-Hard. In this paper we describe a heuristic method (COMAR) that is able to reconstruct, with an unprecedented speed (3-15 seconds), a 3D model that exactly matches the target contact map of a protein. Working with a non-redundant set of 1760 proteins, we find that the scoring efficiency of finding a 3D model very close to the protein's native structure depends on the threshold value adopted to compute the protein residue contact map. Contact maps whose threshold values range from 10 to 18 Ångströms allow the reconstruction of 3D models that are very similar to the protein's native structure.
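What "exactly matches the target contact map" means can be sketched as a consistency check: a model matches when no residue pair has a contact status that disagrees with the target. The sketch below is not COMAR itself; the function name `contact_violations`, the toy coordinates, and the "less than or equal to the threshold" rule are assumptions made for illustration.

```python
import math


def contact_violations(coords, target_map, threshold):
    """Return residue pairs (i, j), i < j, whose contact status in the
    3D model disagrees with the target contact map.

    Assumption: two residues are in contact when their Euclidean
    distance is less than or equal to the threshold.
    """
    n = len(coords)
    violations = []
    for i in range(n):
        for j in range(i + 1, n):
            in_contact = math.dist(coords[i], coords[j]) <= threshold
            if in_contact != bool(target_map[i][j]):
                violations.append((i, j))
    return violations


# Toy model: made-up coordinates in Angstroms, not real protein data.
model = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (20.0, 0.0, 0.0)]
target = [
    [1, 1, 1, 0],
    [1, 1, 1, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 1],
]

# An empty list means the model realizes the target map at this threshold.
print(contact_violations(model, target, threshold=10.0))
```

The same model checked against the same map at a different threshold can fail, which illustrates why a contact map is only meaningful together with the threshold used to compute it.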


Bioinformatics | 2008

FT-COMAR

Marco Vassura; Luciano Margara; Pietro Di Lena; Filippo Medri; Piero Fariselli; Rita Casadio

Fault Tolerant Contact Map Reconstruction (FT-COMAR) is a heuristic algorithm for the reconstruction of the protein three-dimensional structure from (possibly) incomplete (i.e. containing unknown entries) and noisy contact maps. FT-COMAR runs within minutes, allowing its application to a large-scale number of predictions. AVAILABILITY http://bioinformatics.cs.unibo.it/FT-COMAR


Biodata Mining | 2011

Blurring contact maps of thousands of proteins: what we can learn by reconstructing 3D structure

Marco Vassura; Pietro Di Lena; Luciano Margara; Maria Mirto; Giovanni Aloisio; Piero Fariselli; Rita Casadio

BACKGROUND The present knowledge of protein structures at the atomic level derives from some 60,000 molecules. Yet the exponentially growing set of hypothetical protein sequences comprises some 10 million chains, and this makes the problem of protein structure prediction one of the challenging goals of bioinformatics. In this context, the protein representation with contact maps is an intermediate step of fold recognition and constitutes the input of contact map predictors. However, contact map representations require fast and reliable methods to reconstruct the specific folding of the protein backbone. METHODS In this paper, by adopting a GRID technology, our algorithm for 3D reconstruction, FT-COMAR, is benchmarked on a huge set of non-redundant proteins (1716), taking random noise into consideration, and this makes our computation the largest ever performed for the task at hand. RESULTS We can observe the effects of introducing random noise on 3D reconstruction and derive some considerations useful for future implementations. The size of the protein set also allows statistical considerations after grouping by SCOP structural classes. CONCLUSIONS Altogether, our data indicate that the quality of 3D reconstruction is unaffected by deleting up to an average of 75% of the real contacts, while only a few percent of randomly generated contacts in place of non-contacts are sufficient to hamper 3D reconstruction.
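The two kinds of noise contrasted in the conclusions (deleting real contacts versus inserting false contacts) can be sketched as follows. The abstract does not give the exact noise protocol, so the function name `blur_contact_map`, the interpretation of both fractions, and the uniform random sampling are assumptions.

```python
import random


def blur_contact_map(cmap, delete_fraction, add_fraction, seed=0):
    """Return a noisy copy of a symmetric binary contact map.

    Assumptions (not taken from the paper):
    - delete_fraction is the fraction of true contacts set to 0;
    - add_fraction is the fraction of non-contacts set to 1;
    - pairs are sampled uniformly at random; the diagonal is untouched.
    """
    rng = random.Random(seed)
    n = len(cmap)
    noisy = [row[:] for row in cmap]
    upper = [(i, j) for i in range(n) for j in range(i + 1, n)]
    contacts = [(i, j) for i, j in upper if cmap[i][j]]
    non_contacts = [(i, j) for i, j in upper if not cmap[i][j]]

    to_delete = rng.sample(contacts, round(delete_fraction * len(contacts)))
    to_add = rng.sample(non_contacts, round(add_fraction * len(non_contacts)))

    # Apply every change to both triangles so the map stays symmetric.
    for i, j in to_delete:
        noisy[i][j] = noisy[j][i] = 0
    for i, j in to_add:
        noisy[i][j] = noisy[j][i] = 1
    return noisy


# Toy 4-residue map (illustrative only).
native_map = [
    [1, 1, 1, 0],
    [1, 1, 1, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 1],
]
noisy_map = blur_contact_map(native_map, delete_fraction=0.75, add_fraction=0.0)
```

A benchmark in the spirit of the abstract would pass maps blurred at increasing noise levels to a reconstruction routine and compare each resulting model with the native structure.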


Human Mutation | 2011

Correlating disease related mutations to their effect on protein stability: a large scale analysis of the human proteome

Rita Casadio; Marco Vassura; Shalinee Tiwari; Piero Fariselli; Pier Luigi Martelli

Single residue mutations in proteins are known to affect protein stability and function. As a consequence, they can be disease associated. Available computational methods starting from protein sequence/structure can predict whether or not a mutated residue is disease associated and whether it promotes instability of the folded protein structure. However, the relationship between stability changes in proteins and their involvement in human diseases still needs to be fully exploited. Here, we try to rationalize in a nutshell the complexity of the question by generalizing over information already stored in public databases. For each single amino acid polymorphism (SAP) type, we derive the probability of being disease-related (Pd) and compute from thermodynamic data three indexes indicating the probability of decreasing (P−), increasing (P+), and perturbing the protein structure stability (Pp). Statistically validated analysis of the different P/Pd correlations indicates that Pd best correlates with Pp. Pp/Pd correlation values are as high as 0.49, and increase up to 0.67 when data variability is taken into consideration. This is indicative of a medium/good correlation between Pd and Pp and corroborates the assumption that protein stability changes can also be disease associated at the proteome level. Hum Mutat 32:1161–1170, 2011. ©2011 Wiley-Liss, Inc.
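A correlation between per-SAP-type probabilities such as Pd and Pp could be computed as in the sketch below. The abstract does not say which correlation coefficient was used; a Pearson coefficient is assumed here, and the numbers are invented placeholders, not data from the paper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-SAP-type values (placeholders for illustration only).
pd_values = [0.10, 0.35, 0.50, 0.80]
pp_values = [0.20, 0.30, 0.65, 0.70]
print(round(pearson(pd_values, pp_values), 2))
```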


Bioinformatics | 2010

Fast overlapping of protein contact maps by alignment of eigenvectors

Pietro Di Lena; Piero Fariselli; Luciano Margara; Marco Vassura; Rita Casadio

MOTIVATION Searching for structural similarity is a key issue of protein functional annotation. The maximum contact map overlap (CMO) is one of the possible measures of protein structure similarity. Exact and approximate methods known to optimize the CMO are computationally expensive, and this hampers their applicability to large-scale comparison of protein structures. RESULTS In this article, we describe a heuristic algorithm (Al-Eigen) for finding a solution to the CMO problem. Our approach relies on the approximation of contact maps by eigendecomposition. We obtain good overlaps of two contact maps by computing the optimal global alignment of a few principal eigenvectors. Our algorithm is simple and fast, and its running time is independent of the number of contacts in the map. Experimental testing indicates that the algorithm is comparable to exact CMO methods in terms of overlap quality and to structural alignment methods in terms of structure similarity detection, and that it is fast enough to be suited for large-scale comparison of protein structures. Furthermore, our preliminary tests indicate that it is quite robust to noise, which makes it suitable for structural similarity detection also for noisy and incomplete contact maps. AVAILABILITY Available at http://bioinformatics.cs.unibo.it/Al-Eigen.
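The idea of approximating a contact map by eigendecomposition can be illustrated with its leading eigenpair alone. This is a minimal sketch, not Al-Eigen: it uses plain power iteration, keeps a single eigenvector, and omits the eigenvector alignment step entirely. It also assumes a symmetric map with ones on the diagonal, so that the largest eigenvalue is positive and dominant.

```python
import math


def dominant_eigenpair(matrix, iterations=500):
    """Approximate the dominant eigenvalue and unit eigenvector of a
    symmetric matrix by power iteration."""
    n = len(matrix)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    # Rayleigh quotient v^T M v gives the eigenvalue for the unit vector v.
    mv = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
    eigenvalue = sum(v[i] * mv[i] for i in range(n))
    return eigenvalue, v


def rank_one_approximation(matrix):
    """Return lambda * v * v^T, the best rank-1 approximation of a
    symmetric matrix whose dominant eigenvalue is positive."""
    lam, v = dominant_eigenpair(matrix)
    n = len(v)
    return [[lam * v[i] * v[j] for j in range(n)] for i in range(n)]


# Toy map of a 3-residue chain where only neighbours are in contact.
chain_map = [
    [1, 1, 0],
    [1, 1, 1],
    [0, 1, 1],
]
lam, vec = dominant_eigenpair(chain_map)
approx = rank_one_approximation(chain_map)
```

Keeping more eigenpairs would tighten the approximation; the cost of working with the eigenvectors depends on the map size rather than on how many contacts it contains, which is consistent with the running-time remark in the abstract.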


International Symposium on Bioinformatics Research and Applications | 2007

Reconstruction of 3D structures from protein contact maps

Marco Vassura; Luciano Margara; Filippo Medri; Pietro Di Lena; Piero Fariselli; Rita Casadio

Proteins are large organic compounds made of amino acids arranged in a linear chain (primary structure). Most proteins fold into unique three-dimensional (3D) structures, interchangeably called tertiary, folded, or native structures. Discovering the tertiary structure of a protein (Protein Folding Problem) can provide important clues about how the protein performs its function, and it is one of the most important problems in Bioinformatics. A contact map of a given protein P is a binary matrix M such that $M_{i,j} = 1$ if and only if the physical distance between amino acids i and j in the native structure is less than or equal to a pre-assigned threshold t. The contact map of each protein is a distinctive signature of its folded structure. Predicting the tertiary structure of a protein directly from its primary structure is a very complex and still unsolved problem. An alternative and probably more feasible approach is to predict the contact map of a protein from its primary structure and then to compute the tertiary structure starting from the predicted contact map. This last problem has recently been proven to be NP-Hard [6]. In this paper we give a heuristic method that is able to reconstruct in a few seconds a 3D model that exactly matches the target contact map. We wish to emphasize that our method computes an exact model for the protein independently of the contact map threshold. To our knowledge, our method outperforms all other techniques in the literature [5,10,17,19] both in the quality of the provided solutions and in running time. Our experimental results are obtained on a non-redundant data set consisting of 1760 proteins, which is by far the largest benchmark set used so far. Average running times range from 3 to 15 seconds depending on the contact map threshold and on the size of the protein. Repeated applications of our method (starting from randomly chosen distinct initial solutions) show that the same contact map may admit (depending on the threshold) quite different 3D models. Extensive experimental results show that contact map thresholds ranging from 10 to 18 Ångströms allow the reconstruction of 3D models that are very similar to the protein's native structure. Our heuristic is freely available for testing on the web at the following URL: http://vassura.web.cs.unibo.it/cmap23d/
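The contact map definition given in this abstract translates directly into code. In the sketch below the function name `contact_map` and the toy coordinates are assumptions; each residue is represented by a single 3D point (for example its C-alpha atom, a choice the abstract does not specify).

```python
import math


def contact_map(coords, threshold):
    """Binary matrix M with M[i][j] = 1 if and only if the Euclidean
    distance between residues i and j is <= threshold."""
    n = len(coords)
    return [
        [1 if math.dist(coords[i], coords[j]) <= threshold else 0
         for j in range(n)]
        for i in range(n)
    ]


# Made-up residue positions in Angstroms (not real protein data).
residues = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (20.0, 0.0, 0.0)]
for row in contact_map(residues, threshold=10.0):
    print(row)
```

With this definition the map is symmetric and every diagonal entry is 1, since each residue is at distance zero from itself.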


Artificial Intelligence in Medicine | 2009

A graph theoretic approach to protein structure selection

Marco Vassura; Luciano Margara; Piero Fariselli; Rita Casadio

OBJECTIVE Protein structure prediction (PSP) aims to reconstruct the 3D structure of a given protein starting from its primary structure (chain of amino acid residues). It is a well-known fact that the 3D structure of a protein only depends on its primary structure. PSP is one of the most important and still unsolved problems in computational biology. Protein structure selection (PSS), instead of reconstructing a 3D model for the given chain, aims to select among a given, possibly large, number of 3D structures (called decoys) those that are closer (according to a given notion of distance) to the original (unknown) one. In this paper we address the PSS problem using graph theoretic techniques. METHODS AND MATERIALS Existing methods for solving PSS make use of suitably defined energy functions which heavily rely on the primary structure of the protein and on protein chemistry. In this paper we present a new approach to PSS which does not take advantage of the knowledge of the primary structure of the protein but only depends on the graph theoretic properties of the decoy graphs (vertices represent residues and edges represent pairs of residues whose Euclidean distance is less than or equal to a fixed threshold). RESULTS Even if our methods only rely on approximate geometric information, experimental results show that some of the adopted graph properties score similarly to energy-based filtering functions in selecting the best decoys. CONCLUSION Our results highlight the principal role of geometric information in PSS, setting a new starting point and filtering method for existing energy function-based techniques.
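Scoring a decoy purely from its graph can be sketched as below. The abstract does not name the graph properties that were used, so the three computed here (edge count, average degree, density) are illustrative stand-ins, and `graph_features` is a hypothetical helper.

```python
def graph_features(adjacency):
    """Compute simple properties of an undirected decoy graph given as a
    symmetric 0/1 adjacency matrix; diagonal entries are ignored."""
    n = len(adjacency)
    degrees = [
        sum(1 for j in range(n) if j != i and adjacency[i][j])
        for i in range(n)
    ]
    edges = sum(degrees) // 2  # each undirected edge is counted twice
    max_edges = n * (n - 1) // 2
    return {
        "edges": edges,
        "average_degree": sum(degrees) / n if n else 0.0,
        "density": edges / max_edges if max_edges else 0.0,
    }


# Toy decoy graph: residues 0-2 are mutually in contact, residue 3 is isolated.
decoy_graph = [
    [1, 1, 1, 0],
    [1, 1, 1, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 1],
]
print(graph_features(decoy_graph))
```

A selection procedure would compute such features for every decoy and rank the decoys by one of them, without consulting the amino acid sequence.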


IEEE/ACM Transactions on Computational Biology and Bioinformatics | 2011

Is There an Optimal Substitution Matrix for Contact Prediction with Correlated Mutations?

Pietro Di Lena; Piero Fariselli; Luciano Margara; Marco Vassura; Rita Casadio

Correlated mutations in proteins are believed to occur in order to preserve the protein functional folding through evolution. Their values can be deduced from sequence and/or structural alignments and are indicative of residue contacts in the protein three-dimensional structure. A correlation among pairs of residues is routinely evaluated with the Pearson correlation coefficient and the MCLACHLAN similarity matrix. In the literature, there is no justification for adopting the MCLACHLAN matrix instead of other substitution matrices. In this paper, we approach the problem of computing the optimal similarity matrix for contact prediction with correlated mutations, i.e., the similarity matrix that maximizes the accuracy of contact prediction with correlated mutations. We describe an optimization procedure, based on the gradient descent method, for computing the optimal similarity matrix and perform an extensive number of experimental tests. Our tests show that there is a large number of optimal matrices that perform similarly to MCLACHLAN. We also find that the upper limit to the accuracy achievable in protein contact prediction is independent of the optimized similarity matrix. This suggests that the poor scoring of the correlated mutations approach may be due to the choice of the linear correlation function in evaluating correlated mutations.
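The combination of a similarity matrix with the Pearson coefficient can be sketched for two alignment columns. This is an unweighted simplification under stated assumptions: the function names are hypothetical, sequence weighting is omitted, and a 0/1 identity similarity stands in for the MCLACHLAN matrix, whose values are not reproduced here.

```python
import math
from itertools import combinations


def identity_similarity(a, b):
    """Placeholder similarity (1 if residues match, else 0).
    A stand-in for a real substitution matrix such as MCLACHLAN."""
    return 1.0 if a == b else 0.0


def correlated_mutation_score(column_i, column_j, similarity=identity_similarity):
    """Pearson correlation between the residue similarities observed at
    two alignment columns, taken over all pairs of sequences."""
    if len(column_i) != len(column_j):
        raise ValueError("columns must come from the same alignment")
    pairs = list(combinations(range(len(column_i)), 2))
    if not pairs:
        return 0.0
    s_i = [similarity(column_i[k], column_i[l]) for k, l in pairs]
    s_j = [similarity(column_j[k], column_j[l]) for k, l in pairs]
    mean_i = sum(s_i) / len(pairs)
    mean_j = sum(s_j) / len(pairs)
    cov = sum((a - mean_i) * (b - mean_j) for a, b in zip(s_i, s_j))
    var_i = sum((a - mean_i) ** 2 for a in s_i)
    var_j = sum((b - mean_j) ** 2 for b in s_j)
    if var_i == 0 or var_j == 0:
        return 0.0  # a fully conserved column carries no covariation signal
    return cov / math.sqrt(var_i * var_j)


# Two toy columns from a 4-sequence alignment that mutate together.
print(correlated_mutation_score("AAGG", "CCTT"))
```

Because `similarity` is a parameter, any other substitution matrix can be supplied in place of the placeholder, which is the degree of freedom the optimization procedure in the abstract explores.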


Mathematical Approaches to Polymer Sequence Analysis and Related Problems | 2011

Divide and Conquer Strategies for Protein Structure Prediction

Pietro Di Lena; Piero Fariselli; Luciano Margara; Marco Vassura; Rita Casadio

In this chapter, we discuss some approaches to the problem of protein structure prediction by addressing “simpler” sub-problems. The rationale behind this strategy is to develop methods for predicting some interesting structural characteristics of the protein, which can be useful per se and, at the same time, can be of help in solving the main problem. In particular, we discuss the problem of predicting the protein secondary structure, which is at the moment one of the most successful sub-problems addressed in computational biology. Available secondary structure predictors are very reliable and can be routinely used for annotating new genomes or as input for other more complex prediction tasks, such as remote homology detection and functional assignments. As a second example, we also discuss the problem of predicting residue–residue contacts in proteins. In this case, the task is much more complex than secondary structure prediction, and no satisfactory results have been achieved so far. Unlike the secondary structure sub-problem, the residue–residue contact sub-problem is not intrinsically simpler than the prediction of the protein structure, since a roughly correct predicted set of residue–residue contacts would directly lead to the prediction of a protein backbone very close to the real structure. These two protein structure sub-problems are discussed in the light of current performance evaluations, which are based on periodic blind checks (CASP meetings) and permanent evaluation (EVA servers).


Algorithms | 2009

On the Reconstruction of Three-dimensional Protein Structures from Contact Maps

Pietro Di Lena; Marco Vassura; Luciano Margara; Piero Fariselli; Rita Casadio

The problem of protein structure prediction is one of the long-standing goals of Computational Biology. Although we are still not able to provide first principle solutions, several shortcuts have been discovered to compute the protein three-dimensional structure when similar protein sequences are available (by means of comparative modeling and remote homology detection). Nonetheless, these approaches can assign structures only to a fraction of proteins in genomes and ab-initio methods are still needed. One relevant step of ab-initio prediction methods is the reconstruction of the protein structures starting from inter-protein residue contacts. In this paper we review the methods developed so far to accomplish the reconstruction task in order to highlight their differences and similarities. The different approaches are fully described and their reported performances, together with their computational complexity, are also discussed.
