Castrense Savojardo | Researchain


Publication

Featured research published by Castrense Savojardo.

Bioinformatics | 2013

BETAWARE: a machine-learning tool to detect and predict transmembrane beta-barrel proteins in prokaryotes

Castrense Savojardo; Piero Fariselli; Rita Casadio

SUMMARY The annotation of membrane proteins in proteomes is an important problem of Computational Biology, especially after the development of high-throughput techniques that allow fast and efficient genome sequencing. Among membrane proteins, transmembrane β-barrels (TMBBs) are poorly represented in the database of protein structures (PDB) and difficult to identify with experimental approaches. They are, however, extremely important, playing key roles in several cell functions and bacterial pathogenicity. TMBBs are included in the lipid bilayer with a β-barrel structure and are presently found in the outer membranes of Gram-negative bacteria, mitochondria and chloroplasts. Recently, we developed two top-performing methods based on machine-learning approaches to tackle both the detection of TMBBs in sets of proteins and the prediction of their topology. Here, we present our BETAWARE program that includes both approaches and can run as a standalone program on a Linux-based computer to easily address in-house massive protein annotation or filtering. AVAILABILITY AND IMPLEMENTATION http://www.biocomp.unibo.it/~savojard/betawarecl.

Bioinformatics | 2011

Improving the prediction of disulfide bonds in Eukaryotes with machine learning methods and protein subcellular localization

Castrense Savojardo; Piero Fariselli; Monther Alhamdoosh; Pier Luigi Martelli; Andrea Pierleoni; Rita Casadio

MOTIVATION Disulfide bonds stabilize protein structures and play relevant roles in their functions. Their formation requires an oxidizing environment, and their stability consequently depends on the ambient redox potential, which may differ according to the subcellular compartment. Several methods are available to predict cysteine-bonding state and connectivity patterns. However, none of them takes into consideration the relevance of protein subcellular localization. RESULTS Here we develop DISLOCATE, a two-step method based on machine learning models for predicting both the bonding state and the connectivity patterns of cysteine residues in a protein chain. We find that the inclusion of protein subcellular localization improves the performance of these predictive steps by 3 and 2 percentage points, respectively. When compared with previously developed methods for predicting disulfide bonds from sequence, DISLOCATE improves the overall performance by more than 10 percentage points. AVAILABILITY The method and the dataset are available at the Web page http://www.biocomp.unibo.it/savojard/Dislocate.html. GRHCRF code is available at http://www.biocomp.unibo.it/savojard/biocrf.html. CONTACT [email protected].

Bioinformatics | 2015

INPS: predicting the impact of non-synonymous variations on protein stability from sequence.

Piero Fariselli; Pier Luigi Martelli; Castrense Savojardo; Rita Casadio

MOTIVATION A tool for reliably predicting the impact of variations on protein stability is extremely important for both protein engineering and for understanding the effects of Mendelian and somatic mutations in the genome. Next Generation Sequencing studies are constantly increasing the number of protein sequences. Given the huge disproportion between protein sequences and structures, there is a need for tools suited to annotate the effect of mutations starting from protein sequence without relying on the structure. Here, we describe INPS, a novel approach for annotating the effect of non-synonymous mutations on the protein stability from its sequence. INPS is based on SVM regression and it is trained to predict the thermodynamic free energy change upon single-point variations in protein sequences. RESULTS We show that INPS performs similarly to the state-of-the-art methods based on protein structure when tested in cross-validation on a non-redundant dataset. INPS performs very well also on a newly generated dataset consisting of a number of variations occurring in the tumor suppressor protein p53. Our results suggest that INPS is a tool suited for computing the effect of non-synonymous polymorphisms on protein stability when the protein structure is not available. We also show that INPS predictions are complementary to those of the state-of-the-art, structure-based method mCSM. When the two methods are combined, the overall prediction on the p53 set scores significantly higher than those of the single methods. AVAILABILITY AND IMPLEMENTATION The presented method is available as web server at http://inps.biocomp.unibo.it. CONTACT [email protected] SUPPLEMENTARY INFORMATION Supplementary Materials are available at Bioinformatics online.

Bioinformatics | 2014

TPpred2: improving the prediction of mitochondrial targeting peptide cleavage sites by exploiting sequence motifs.

Castrense Savojardo; Pier Luigi Martelli; Piero Fariselli; Rita Casadio

SUMMARY Targeting peptides are N-terminal sorting signals in proteins that promote their translocation to mitochondria through the interaction with different protein machineries. We recently developed TPpred, a machine learning-based method scoring among the best ones available to predict the presence of a targeting peptide in a protein sequence and its cleavage site. Here we introduce TPpred2, which improves on the performance of TPpred in the task of identifying the cleavage site of the targeting peptides. TPpred2 is now available as a web interface and as a stand-alone version for users who can freely download and adopt it for processing large volumes of sequences. AVAILABILITY AND IMPLEMENTATION TPpred2 is available both as web server and stand-alone version at http://tppred2.biocomp.unibo.it. CONTACT [email protected] SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.

Nucleic Acids Research | 2011

MemPype: a pipeline for the annotation of eukaryotic membrane proteins

Andrea Pierleoni; Valentina Indio; Castrense Savojardo; Piero Fariselli; Pier Luigi Martelli; Rita Casadio

MemPype is a Python-based pipeline including previously published methods for the prediction of signal peptides (SPEP), glycophosphatidylinositol (GPI) anchors (PredGPI), all-alpha membrane topology (ENSEMBLE), and a recent method (MemLoci) that specifically discriminates the localization of eukaryotic membrane proteins in: ‘cell membrane’, ‘internal membranes’, ‘organelle membranes’. MemLoci achieves an accuracy of 70% and a generalized correlation coefficient (GCC) of 0.50 on a rigorous homology-unbiased validation set and outperforms other predictors for subcellular localization. The annotation process is based both on inheritance through homology and computational methods. Each submitted protein first retrieves, when available, up to 25 similar proteins (with sequence identity ≥50% and alignment coverage ≥50% on both sequences). This helps the identification of membrane-associated proteins and detailed localization tags. Each protein is also filtered for the presence of a GPI anchor [0.8% false positive rate (FPR)]. A positive score of GPI anchor prediction labels the sequence as exposed to ‘Cell surface’. Concomitantly, the sequence is analysed for the presence of a signal peptide and classified with MemLoci into one of three discriminated classes. Finally, the sequence is filtered for predicting its putative all-alpha protein membrane topology (FPR <1%). The web server is available at: http://mu2py.biocomp.unibo.it/mempype.
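
The homology retrieval step in the MemPype abstract (up to 25 similar proteins, with sequence identity ≥50% and alignment coverage ≥50% on both sequences) can be sketched as a simple filter. This is a minimal illustration and not MemPype's actual code: the `Hit` record, the `select_homologs` function and the choice to rank the surviving hits by identity are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """Hypothetical record for one alignment hit (not a MemPype type)."""
    subject_id: str
    identity: float          # fraction of identical aligned positions, 0..1
    query_coverage: float    # fraction of the query covered by the alignment
    subject_coverage: float  # fraction of the subject covered by the alignment


def select_homologs(hits, min_identity=0.5, min_coverage=0.5, max_hits=25):
    """Keep hits that pass the identity threshold and the coverage
    threshold on BOTH sequences, then return at most max_hits of them.

    Ranking by identity is an assumption; the abstract does not say
    how the up-to-25 similar proteins are ordered.
    """
    kept = [
        h for h in hits
        if h.identity >= min_identity
        and h.query_coverage >= min_coverage
        and h.subject_coverage >= min_coverage
    ]
    kept.sort(key=lambda h: h.identity, reverse=True)
    return kept[:max_hits]
```

A hit with 90% identity that covers only 40% of either sequence would be discarded by this filter, because the coverage condition must hold on both sides.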

Bioinformatics | 2011

Improving the detection of transmembrane β-barrel chains with N-to-1 extreme learning machines

Castrense Savojardo; Piero Fariselli; Rita Casadio

MOTIVATION Transmembrane β-barrels (TMBBs) are extremely important proteins that play key roles in several cell functions. They cross the lipid bilayer with β-barrel structures. TMBBs are presently found in the outer membranes of Gram-negative bacteria and of mitochondria and chloroplasts. Loop exposure outside the bacterial cell membranes makes TMBBs important targets for vaccine or drug therapies. In genomes, they are not highly represented and are difficult to identify with experimental approaches. Several computational methods have been developed to discriminate TMBBs from other types of proteins. However, the best performing approaches have a high fraction of false positive predictions. RESULTS In this article, we introduce a new machine learning approach for TMBB detection based on N-to-1 Extreme Learning Machines that significantly outperforms previous methods achieving a Matthews correlation coefficient of 0.82, a probability of correct prediction of 0.92 and a sensitivity of 0.73.
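
For readers unfamiliar with the scores quoted in this abstract, the following minimal sketch computes the Matthews correlation coefficient and the sensitivity from the four counts of a binary confusion matrix, using their standard definitions. The function names are illustrative, and the counts a caller passes in are their own; none of this reproduces the paper's evaluation data.

```python
import math


def matthews_cc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when any marginal is empty, a common convention for
    the otherwise undefined 0/0 case.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


def sensitivity(tp, fn):
    """Fraction of true positives (e.g. real TMBBs) that are detected."""
    total = tp + fn
    return tp / total if total else 0.0
```

The coefficient ranges from -1 (every prediction wrong) through 0 (no better than chance) to 1 (perfect prediction), which makes it more informative than plain accuracy when positives such as TMBBs are rare.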

Bioinformatics | 2013

BCov: a method for predicting β-sheet topology using sparse inverse covariance estimation and integer programming

Castrense Savojardo; Piero Fariselli; Pier Luigi Martelli; Rita Casadio

MOTIVATION Prediction of protein residue contacts, even at the coarse-grain level, can help in finding solutions to the protein structure prediction problem. Unlike α-helices that are locally stabilized, β-sheets result from pairwise hydrogen bonding of two or more disjoint regions of the protein backbone. The problem of predicting contacts among β-strands in proteins has been addressed by several supervised computational approaches. Recently, prediction of residue contacts based on correlated mutations has been greatly improved and finally allows the prediction of 3D structures of the proteins. RESULTS In this article, we describe BCov, which is the first unsupervised method to predict the β-sheet topology starting from the protein sequence and its secondary structure. BCov takes advantage of the sparse inverse covariance estimation to define β-strand partner scores. Then an optimization based on integer programming is carried out to predict the β-sheet connectivity. When tested on the prediction of β-strand pairing, BCov scores with average values of Matthews Correlation Coefficient (MCC) and F1 equal to 0.56 and 0.61, respectively, on a non-redundant dataset of 916 protein chains known with atomic resolution. Our approach compares well with the state-of-the-art methods trained so far for this specific task. AVAILABILITY AND IMPLEMENTATION The method is freely available under General Public License at http://biocomp.unibo.it/savojard/bcov/bcov-1.0.tar.gz. The new dataset BetaSheet1452 can be downloaded at http://biocomp.unibo.it/savojard/bcov/BetaSheet1452.dat.
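
As an illustration of the F1 score reported for β-strand pairing, the sketch below compares a set of predicted strand pairs with the observed ones using the standard precision/recall definition. Representing a pairing as an unordered pair of strand indices, and the function name `pairing_f1`, are assumptions made here; the abstract does not describe the exact evaluation protocol used for BCov.

```python
def pairing_f1(predicted, observed):
    """F1 score between predicted and observed strand pairings.

    Each pairing is an iterable of two strand indices; order within a
    pair is ignored, so (3, 2) and (2, 3) count as the same pairing.
    """
    pred = {tuple(sorted(p)) for p in predicted}
    obs = {tuple(sorted(p)) for p in observed}
    true_pos = len(pred & obs)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(pred)
    recall = true_pos / len(obs)
    return 2 * precision * recall / (precision + recall)
```

For example, predicting pairs (1, 2) and (3, 2) when the observed pairs are (1, 2), (2, 3) and (3, 4) gives precision 1, recall 2/3 and an F1 of 0.8.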

Bioinformatics | 2016

INPS-MD: a web server to predict stability of protein variants from sequence and structure

Castrense Savojardo; Piero Fariselli; Pier Luigi Martelli; Rita Casadio

MOTIVATION Protein function depends on its structural stability. The effects of single point variations on protein stability can elucidate the molecular mechanisms of human diseases and help in developing new drugs. Recently, we introduced INPS, a method suited to predict the effect of variations on protein stability from protein sequence and whose performance is competitive with the available state-of-the-art tools. RESULTS In this article, we describe INPS-MD (Impact of Non synonymous variations on Protein Stability-Multi-Dimension), a web server for the prediction of protein stability changes upon single point variation from protein sequence and/or structure. Here, we complement INPS with a new predictor (INPS3D) that exploits features derived from protein 3D structure. INPS3D achieves a Pearson's correlation with experimental ΔΔG values of 0.58 in cross validation and of 0.72 on a blind test set. The sequence-based INPS scores slightly lower than the structure-based INPS3D, and both compare well with the state-of-the-art methods on the same blind test sets. AVAILABILITY AND IMPLEMENTATION INPS and INPS3D are available at the same web server: http://inpsmd.biocomp.unibo.it SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online. CONTACT [email protected].
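
The Pearson correlation used above to compare predicted and experimental ΔΔG values can be computed as in the following minimal sketch. The function name and argument names are illustrative assumptions, and the code is the textbook formula rather than anything taken from the INPS-MD server.

```python
import math


def pearson_correlation(predicted, experimental):
    """Pearson correlation between two equally long lists of values,
    e.g. predicted and experimental ddG in kcal/mol."""
    n = len(predicted)
    if n != len(experimental) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_p = sum(predicted) / n
    mean_e = sum(experimental) / n
    cov = sum((p - mean_p) * (e - mean_e)
              for p, e in zip(predicted, experimental))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_e = sum((e - mean_e) ** 2 for e in experimental)
    if var_p == 0 or var_e == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_p * var_e)
```

A value near 1 means the predictor ranks and scales stability changes much like the experiments do, while a value near 0 means the predictions carry little linear information about the measured ΔΔG.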

Algorithms for Molecular Biology | 2009

Grammatical-Restrained Hidden Conditional Random Fields for Bioinformatics applications

Piero Fariselli; Castrense Savojardo; Pier Luigi Martelli; Rita Casadio

BACKGROUND Discriminative models are designed to naturally address classification tasks. However, some applications require the inclusion of grammar rules, and in these cases generative models, such as Hidden Markov Models (HMMs) and Stochastic Grammars, are routinely applied. RESULTS We introduce Grammatical-Restrained Hidden Conditional Random Fields (GRHCRFs) as an extension of Hidden Conditional Random Fields (HCRFs). GRHCRFs, while preserving the discriminative character of HCRFs, can assign labels in agreement with the production rules of a defined grammar. The main GRHCRF novelty is the possibility of including in HCRFs prior knowledge of the problem by means of a defined grammar. Our current implementation allows regular grammar rules. We test our GRHCRF on a typical biosequence labeling problem: the prediction of the topology of Prokaryotic outer-membrane proteins. CONCLUSION We show that in a typical biosequence labeling problem the GRHCRF performs better than CRF models of the same complexity, indicating that GRHCRFs can be useful tools for biosequence analysis applications. AVAILABILITY GRHCRF software is available under GPLv3 licence at the website http://www.biocomp.unibo.it/~savojard/biocrf-0.9.tar.gz

BMC Bioinformatics | 2013

Prediction of disulfide connectivity in proteins with machine-learning methods and correlated mutations

Castrense Savojardo; Piero Fariselli; Pier Luigi Martelli; Rita Casadio

BACKGROUND Recently, information derived from correlated mutations in proteins has regained relevance for predicting protein contacts. This is due to new forms of mutual information analysis that have been proven to be more suitable to highlight direct coupling between pairs of residues in protein structures and to the large number of protein chains that are currently available for statistical validation. It was previously discussed that disulfide bond topology in proteins is also constrained by correlated mutations. RESULTS In this paper we exploit information derived from a corrected mutual information analysis and from the inverse of the covariance matrix to address the problem of the prediction of the topology of disulfide bonds in Eukaryotes. Recently, we have shown that Support Vector Regression (SVR) can improve the prediction for the disulfide connectivity patterns. Here we show that the inclusion of the correlated mutation information increases the SVR performance by 5 percentage points (from 54% to 59%). When this approach is used in combination with a method previously developed by us and scoring at the state of the art in predicting both location and topology of disulfide bonds in Eukaryotes (DisLocate), the per-protein accuracy is 38%, 2 percentage points higher than that previously obtained. CONCLUSIONS In this paper we show that the inclusion of information derived from correlated mutations can improve the performance of the state-of-the-art methods for predicting disulfide connectivity patterns in Eukaryotic proteins. Our analysis also provides support to the notion that improving methods to extract evolutionary information from multiple sequence alignments greatly contributes to the scoring performance of predictors suited to detect relevant features from protein chains.
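
To make the notion of correlated mutations concrete, the sketch below computes the plain mutual information between two columns of a multiple sequence alignment. This is only the uncorrected textbook quantity: the corrected mutual information and the inverse covariance analysis used in the paper are not reproduced here, and the function name and the string representation of columns are assumptions for the example.

```python
import math
from collections import Counter


def column_mutual_information(col_a, col_b):
    """Mutual information (in bits) between two alignment columns.

    col_a and col_b hold one residue per sequence, in the same
    sequence order, e.g. "CCAA" and "GGTT" for four sequences.
    """
    n = len(col_a)
    if n == 0 or n != len(col_b):
        raise ValueError("columns must be non-empty and equally long")
    freq_a = Counter(col_a)
    freq_b = Counter(col_b)
    freq_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), count in freq_ab.items():
        p_ab = count / n
        p_a = freq_a[a] / n
        p_b = freq_b[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi
```

Two columns that always change together score high, while a fully conserved column scores zero against any partner, since it carries no variation to correlate with.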
