Alberto J. M. Martin
University of Padua
Publication
Featured research published by Alberto J. M. Martin.
Bioinformatics | 2012
Ian Walsh; Alberto J. M. Martin; Tomás Di Domenico
MOTIVATION: Intrinsically disordered regions are key for the function of numerous proteins, and the scant available experimental annotations suggest the existence of different disorder flavors. While efficient predictions are required to annotate entire genomes, most existing methods require sequence profiles for disorder prediction, making them cumbersome for high-throughput applications.
RESULTS: In this work, we present an ensemble of protein disorder predictors called ESpritz. These are based on bidirectional recursive neural networks and trained on three different flavors of disorder, including a novel NMR flexibility predictor. ESpritz can produce fast and accurate sequence-only predictions, annotating entire genomes in the order of hours on a single processor core. Alternatively, a slower but slightly more accurate ESpritz variant using sequence profiles can be used for applications requiring maximum performance. Two levels of prediction confidence allow either to maximize reasonable disorder detection or to limit expected false positives to 5%. ESpritz performs consistently well on the recent CASP9 data, reaching a S(w) measure of 54.82 and area under the receiver operator curve of 0.856. The fast predictor is four orders of magnitude faster and remains better than most publicly available CASP9 methods, making it ideal for genomic scale predictions.
CONCLUSIONS: ESpritz predicts three flavors of disorder at two distinct false positive rates, either with a fast or slower and slightly more accurate approach. Given its state-of-the-art performance, it can be especially useful for high-throughput applications.
AVAILABILITY: Both a web server for high-throughput analysis and a Linux executable version of ESpritz are available from: http://protein.bio.unipd.it/espritz/.
Bioinformatics | 2012
Tomás Di Domenico; Ian Walsh; Alberto J. M. Martin
MOTIVATION: Disordered protein regions are key to the function of numerous processes within an organism and to the determination of a protein's biological role. The most common source for protein disorder annotations, DisProt, covers only a fraction of the available sequences. Alternatively, the Protein Data Bank (PDB) has been mined for missing residues in X-ray crystallographic structures. Herein, we provide a centralized source for data on different flavours of disorder in protein structures, MobiDB, building on and expanding the content provided by already existing sources. In addition to the DisProt and PDB X-ray structures, we have added experimental information from NMR structures and five different flavours of two disorder predictors (ESpritz and IUpred). These are combined into a weighted consensus disorder used to classify disordered regions into flexible and constrained disorder. Users are encouraged to submit manual annotations through a submission form. MobiDB features experimental annotations for 17 285 proteins, covering the entire PDB and predictions for the SwissProt database, with 565 200 annotated sequences. Depending on the disorder flavour, 6-20% of the residues are predicted as disordered.
AVAILABILITY: The database is freely available at http://mobidb.bio.unipd.it/.
BMC Bioinformatics | 2007
Gianluca Pollastri; Alberto J. M. Martin; Catherine Mooney; Alessandro Vullo
Background: Structural properties of proteins such as secondary structure and solvent accessibility contribute to three-dimensional structure prediction, not only in the ab initio case but also when homology information to known structures is available. Structural properties are also routinely used in protein analysis even when homology is available, largely because homology modelling is lower throughput than, say, secondary structure prediction. Nonetheless, predictors of secondary structure and solvent accessibility are virtually always ab initio.
Results: Here we develop high-throughput machine learning systems for the prediction of protein secondary structure and solvent accessibility that exploit homology to proteins of known structure, where available, in the form of simple structural frequency profiles extracted from sets of PDB templates. We compare these systems to their state-of-the-art ab initio counterparts, and with a number of baselines in which secondary structures and solvent accessibilities are extracted directly from the templates. We show that structural information from templates greatly improves secondary structure and solvent accessibility prediction quality, and that, on average, the systems significantly enrich the information contained in the templates. For sequence similarity exceeding 30%, secondary structure prediction quality is approximately 90%, close to its theoretical maximum, and 2-class solvent accessibility roughly 85%. Gains are robust with respect to template selection noise, and significant for marginal sequence similarity and for short alignments, supporting the claim that these improved predictions may prove beneficial beyond the case in which clear homology is available.
Conclusion: The predictive systems are publicly available at the address http://distill.ucd.ie.
BMC Bioinformatics | 2006
Davide Baù; Alberto J. M. Martin; Catherine Mooney; Alessandro Vullo; Ian Walsh; Gianluca Pollastri
Background: We describe Distill, a suite of servers for the prediction of protein structural features: secondary structure; relative solvent accessibility; contact density; backbone structural motifs; residue contact maps at 6, 8 and 12 Angstrom; coarse protein topology. The servers are based on large-scale ensembles of recursive neural networks and trained on large, up-to-date, non-redundant subsets of the Protein Data Bank. Together with structural feature predictions, Distill includes a server for prediction of Cα traces for short proteins (up to 200 amino acids).
Results: The servers are state-of-the-art, with secondary structure predicted correctly for nearly 80% of residues (currently the top performance on EVA), 2-class solvent accessibility nearly 80% correct, and contact maps exceeding 50% precision on the top non-diagonal contacts. A preliminary implementation of the predictor of protein Cα traces featured among the top 20 Novel Fold predictors at the last CASP6 experiment as group Distill (ID 0348). The majority of the servers, including the Cα trace predictor, now take into account homology information from the PDB, when available, resulting in greatly improved reliability.
Conclusion: All predictions are freely available through a simple joint web interface and the results are returned by email. In a single submission the user can send protein sequences for a total of up to 32k residues to all or a selection of the servers. Distill is accessible at the address: http://distill.ucd.ie/distill/.
Bioinformatics | 2011
Alberto J. M. Martin; Michele Vidotto; Filippo Boscariol; Tomás Di Domenico; Ian Walsh
MOTIVATION: Residue interaction networks (RINs) have been used in the literature to describe the protein 3D structure as a graph where nodes represent residues and edges physico-chemical interactions, e.g. hydrogen bonds or van-der-Waals contacts. Topological network parameters can be calculated over RINs and have been correlated with various aspects of protein structure and function. Here we present a novel web server, RING, to construct physico-chemically valid RINs interactively from PDB files for subsequent visualization in the Cytoscape platform. The additional structure-based parameters secondary structure, solvent accessibility and experimental uncertainty can be combined with information regarding residue conservation, mutual information and residue-based energy scoring functions. Different visualization styles are provided to facilitate visualization and standard plugins can be used to calculate topological parameters in Cytoscape. A sample use case analyzing the active site of glutathione peroxidase is presented.
AVAILABILITY: The RING server, supplementary methods, examples and tutorials are available for non-commercial use at URL: http://protein.bio.unipd.it/ring/.
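To make the graph view of a structure concrete, here is a minimal Python sketch of a residue interaction network. It is not the RING method: RING defines edges from physico-chemical interactions, whereas this sketch substitutes a plain distance cutoff between one representative point per residue. The function names, the 5.0 Å cutoff and the input format (a dict from residue number to coordinates) are all assumptions made for illustration.

```python
import math
from itertools import combinations


def build_rin(coords, cutoff=5.0, min_seq_sep=1):
    """Build a toy residue interaction network.

    coords maps a residue number to an (x, y, z) tuple (hypothetical input
    format). Two residues are linked when their points lie within `cutoff`
    angstroms; this distance rule is a stand-in for real interaction typing.
    """
    graph = {res: set() for res in coords}
    for (a, pa), (b, pb) in combinations(sorted(coords.items()), 2):
        if abs(a - b) < min_seq_sep:
            continue  # skip pairs that are too close in sequence
        if math.dist(pa, pb) <= cutoff:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def node_degrees(graph):
    """Degree per residue, one of the simplest topological parameters."""
    return {res: len(neighbours) for res, neighbours in graph.items()}


toy_coords = {1: (0.0, 0.0, 0.0), 2: (3.0, 0.0, 0.0), 3: (10.0, 0.0, 0.0)}
toy_rin = build_rin(toy_coords)
toy_degrees = node_degrees(toy_rin)
```

In this toy input only residues 1 and 2 fall within the cutoff, so residue 3 remains an isolated node.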
Nucleic Acids Research | 2011
Ian Walsh; Alberto J. M. Martin; Tomás Di Domenico; Alessandro Vullo; Gianluca Pollastri
CSpritz is a web server for the prediction of intrinsic protein disorder. It is a combination of previous Spritz with two novel orthogonal systems developed by our group (Punch and ESpritz). Punch is based on sequence and structural templates trained with support vector machines. ESpritz is an efficient single sequence method based on bidirectional recursive neural networks. Spritz was extended to filter predictions based on structural homologues. After extensive testing, predictions are combined by averaging their probabilities. The CSpritz website can elaborate single or multiple predictions for either short or long disorder. The server provides a global output page, for download and simultaneous statistics of all predictions. Links are provided to each individual protein where the amino acid sequence and disorder prediction are displayed along with statistics for the individual protein. As a novel feature, CSpritz provides information about structural homologues as well as secondary structure and short functional linear motifs in each disordered segment. Benchmarking was performed on the very recent CASP9 data, where CSpritz would have ranked consistently well with a Sw measure of 49.27 and AUC of 0.828. The server, together with help and methods pages including examples, are freely available at URL: http://protein.bio.unipd.it/cspritz/.
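The combination step described above, averaging the probabilities of the component predictors, can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the CSpritz implementation: the function name, the 0.5 decision threshold and the "D"/"O" labels for disordered and ordered residues are hypothetical.

```python
from typing import List, Sequence, Tuple


def average_disorder(
    per_predictor: Sequence[Sequence[float]], threshold: float = 0.5
) -> Tuple[List[float], str]:
    """Average per-residue disorder probabilities across predictors.

    Each inner sequence holds one predictor's probabilities for the same
    protein. The threshold is an assumed value, not one taken from CSpritz.
    """
    if not per_predictor:
        raise ValueError("at least one predictor is required")
    length = len(per_predictor[0])
    if any(len(probs) != length for probs in per_predictor):
        raise ValueError("all predictors must cover the same residues")
    count = len(per_predictor)
    # zip(*...) walks the predictors residue by residue.
    mean = [sum(column) / count for column in zip(*per_predictor)]
    labels = "".join("D" if p >= threshold else "O" for p in mean)
    return mean, labels


mean_probs, labels = average_disorder(
    [[1.0, 0.0], [0.5, 0.25], [0.0, 0.5]]
)
```

A simple mean treats all predictors as equally reliable; a weighted scheme would need weights that the abstract does not give.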
BMC Genomics | 2014
Manuel Giollo; Alberto J. M. Martin; Ian Walsh; Carlo Ferrari
Background: The rapid growth of un-annotated missense variants poses challenges requiring novel strategies for their interpretation. From the thermodynamic point of view, amino acid changes can lead to a change in the internal energy of a protein and induce structural rearrangements. This is of great relevance for the study of diseases and protein design, justifying the development of prediction methods for variant-induced stability changes.
Results: Here we propose NeEMO, a tool for the evaluation of stability changes using an effective representation of proteins based on residue interaction networks (RINs). RINs are used to extract useful features describing interactions of the mutant amino acid with its structural environment. Benchmarking shows NeEMO to be very effective, allowing reliable predictions in different parts of the protein such as β-strands and buried residues. Validation on a previously published independent dataset shows that NeEMO has a Pearson correlation coefficient of 0.77 and a standard error of 1 kcal/mol, outperforming nine recent methods. The NeEMO web server can be freely accessed from URL: http://protein.bio.unipd.it/neemo/.
Conclusions: NeEMO offers an innovative and reliable tool for the annotation of amino acid changes. RINs are a key contribution, as they can be used for modeling proteins and their interactions effectively. Interestingly, the approach is very general, and can motivate the development of a new family of RIN-based protein structure analyzers. NeEMO may suggest innovative strategies for bioinformatics tools beyond protein stability prediction.
BMC Structural Biology | 2009
Ian Walsh; Davide Baù; Alberto J. M. Martin; Catherine Mooney; Alessandro Vullo; Gianluca Pollastri
Background: Prediction of protein structures from their sequences is still one of the open grand challenges of computational biology. Some approaches to protein structure prediction, especially ab initio ones, rely to some extent on the prediction of residue contact maps. Residue contact map predictions have been assessed at the CASP competition for several years now. Although it has been shown that exact contact maps generally yield correct three-dimensional structures, this is true only at a relatively low resolution (3–4 Å from the native structure). Another known weakness of contact maps is that they are generally predicted ab initio, that is not exploiting information about potential homologues of known structure.
Results: We introduce a new class of distance restraints for protein structures: multi-class distance maps. We show that Cα trace reconstructions based on 4-class native maps are significantly better than those from residue contact maps. We then build two predictors of 4-class maps based on recursive neural networks: one ab initio, or relying on the sequence and on evolutionary information; one template-based, or in which homology information to known structures is provided as a further input. We show that virtually any level of sequence similarity to structural templates (down to less than 10%) yields more accurate 4-class maps than the ab initio predictor. We show that template-based predictions by recursive neural networks are consistently better than the best template and than a number of combinations of the best available templates. We also extract binary residue contact maps at an 8 Å threshold (as per CASP assessment) from the 4-class predictors and show that the template-based version is also more accurate than the best template and consistently better than the ab initio one, down to very low levels of sequence identity to structural templates. Furthermore, we test both ab initio and template-based 8 Å predictions on the CASP7 targets using a pre-CASP7 PDB, and find that both predictors are state-of-the-art, with the template-based one far outperforming the best CASP7 systems if templates with sequence identity to the query of 10% or better are available. Although this is not the main focus of this paper we also report on reconstructions of Cα traces based on both ab initio and template-based 4-class map predictions, showing that the latter are generally more accurate even when homology is dubious.
Conclusion: Accurate predictions of multi-class maps may provide valuable constraints for improved ab initio and template-based prediction of protein structures, naturally incorporate multiple templates, and yield state-of-the-art binary maps. Predictions of protein structures and 8 Å contact maps based on the multi-class distance map predictors described in this paper are freely available to academic users at the url http://distill.ucd.ie/.
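As an illustration of what a multi-class distance map encodes, the Python sketch below bins pairwise Cα distances from known coordinates into four classes and derives a binary 8 Å contact map from them. It computes native maps only and does not reproduce the neural network predictors. The class boundaries (8, 13 and 19 Å) and the function names are assumptions chosen for the example; only the 8 Å contact threshold comes from the abstract.

```python
import math
from bisect import bisect_right

# Hypothetical class boundaries in angstroms; only the first (8.0) matches
# the contact threshold named in the abstract.
THRESHOLDS = (8.0, 13.0, 19.0)


def distance_classes(ca_coords, thresholds=THRESHOLDS):
    """Return an N x N map of distance classes 0..len(thresholds).

    Class 0 holds distances below the first threshold, and the last class
    holds distances at or beyond the final threshold.
    """
    n = len(ca_coords)
    return [
        [
            bisect_right(thresholds, math.dist(ca_coords[i], ca_coords[j]))
            for j in range(n)
        ]
        for i in range(n)
    ]


def binary_contacts(class_map):
    """Collapse a class map to contacts (valid when threshold 0 is 8 A)."""
    return [[cls == 0 for cls in row] for row in class_map]


toy_trace = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (10.0, 0.0, 0.0), (25.0, 0.0, 0.0)]
toy_classes = distance_classes(toy_trace)
toy_contacts = binary_contacts(toy_classes)
```

The binary map is a strict coarsening of the class map, so the multi-class form carries more distance information per residue pair than a contact map does.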
Proceedings of the National Academy of Sciences of the United States of America | 2013
Gabriella Mazzotta; Alessandro Rossi; Emanuela Leonardi; Moyra Mason; Cristiano Bertolucci; Laura Caccin; Barbara Spolaore; Alberto J. M. Martin; Matthias Schlichting; Rudi Grebler; Charlotte Helfrich-Förster; Stefano Mammi; Rodolfo Costa
Cryptochromes are flavoproteins, structurally and evolutionarily related to photolyases, that are involved in the development, magnetoreception, and temporal organization of a variety of organisms. Drosophila CRYPTOCHROME (dCRY) is involved in light synchronization of the master circadian clock, and its C terminus plays an important role in modulating light sensitivity and activity of the protein. The activation of dCRY by light requires a conformational change, but it has been suggested that activation could be mediated also by specific “regulators” that bind the C terminus of the protein. This C-terminal region harbors several protein–protein interaction motifs, likely relevant for signal transduction regulation. Here, we show that some functional linear motifs are evolutionarily conserved in the C terminus of cryptochromes and that class III PDZ-binding sites are selectively maintained in animals. A coimmunoprecipitation assay followed by mass spectrometry analysis revealed that dCRY interacts with Retinal Degeneration A (RDGA) and with Neither Inactivation Nor Afterpotential C (NINAC) proteins. Both proteins belong to a multiprotein complex (the Signalplex) that includes visual-signaling molecules. Using bioinformatic and molecular approaches, dCRY was found to interact with Neither Inactivation Nor Afterpotential C through Inactivation No Afterpotential D (INAD) in a light-dependent manner and that the CRY–Inactivation No Afterpotential D interaction is mediated by specific domains of the two proteins and involves the CRY C terminus. Moreover, an impairment of the visual behavior was observed in fly mutants for dCRY, indicative of a role, direct or indirect, for this photoreceptor in fly vision.
Bioinformatics | 2010
Alberto J. M. Martin; Ian Walsh
MOTIVATION: MOBI is a web server for the identification of structurally mobile regions in NMR protein ensembles. It provides a binary mobility definition that is analogous to the commonly used definition of intrinsic disorder in X-ray crystallographic structures. At least three different use cases can be envisaged: (i) visualization of NMR mobility for structural analysis; (ii) definition of regions for reliable comparative modelling in protein structure prediction and (iii) definition of mobility in analogy to intrinsic disorder. MOBI uses structural superposition and local conformational differences to derive a robust binary mobility definition that is in excellent agreement with the manually curated definition used in the CASP8 experiment for intrinsic disorder in NMR structure. The output includes mobility-coloured PDB files, mobility plots and a FASTA formatted sequence file summarizing the mobility results.
AVAILABILITY: The MOBI server and supplementary methods are available for non-commercial use at URL: http://protein.bio.unipd.it/mobi/.