Publication


Featured research published by P. Fariselli.


Proteins | 2002

Prediction of coordination number and relative solvent accessibility in proteins

Gianluca Pollastri; Pierre Baldi; P. Fariselli; Rita Casadio

Knowing the coordination number and relative solvent accessibility of all the residues in a protein is crucial for deriving constraints useful in modeling protein folding and protein structure and in scoring remote homology searches. We develop ensembles of bidirectional recurrent neural network architectures to improve the state of the art in both contact and accessibility prediction, leveraging a large corpus of curated data together with evolutionary information. The ensembles are used to discriminate between two different states of residue contacts or relative solvent accessibility, higher or lower than a threshold determined by the average value of the residue distribution or the accessibility cutoff. For coordination numbers, the ensemble achieves performances ranging within 70.6–73.9% depending on the radius adopted to discriminate contacts (6Å–12Å). These performances represent gains of 16–20% over the baseline statistical predictor, always assigning an amino acid to the largest class, and are 4–7% better than any previous method. A combination of different radius predictors further improves performance. For accessibility thresholds in the relevant 15–30% range, the ensemble consistently achieves a performance above 77%, which is 10–16% above the baseline prediction and better than other existing predictors, by up to several percentage points. For both problems, we quantify the improvement due to evolutionary information in the form of PSI‐BLAST‐generated profiles over BLAST profiles. The prediction programs are implemented in the form of two web servers, CONpro and ACCpro, available at http://promoter.ics.uci.edu/BRNN‐PRED/. Proteins 2002;47:142–153.
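The two-state setup described in this abstract can be illustrated with a minimal sketch. It is not the CONpro or ACCpro implementation: the function names, the toy coordinates, and the choice to count every other residue within the radius (with no sequence-separation filter) are assumptions made for illustration only. It computes coordination numbers at a given radius, labels each residue as above or below the mean, and reports the accuracy of the baseline that always assigns the largest class.

```python
import math


def coordination_numbers(coords, radius):
    """For each residue, count the other residues within `radius` (in Å)."""
    counts = []
    for i, a in enumerate(coords):
        n = sum(
            1
            for j, b in enumerate(coords)
            if i != j and math.dist(a, b) <= radius
        )
        counts.append(n)
    return counts


def two_state_labels(values, threshold=None):
    """Label 1 if a value is above the threshold, else 0.

    The threshold defaults to the mean of the values (an assumption that
    mirrors the 'average value of the residue distribution' in the text).
    """
    if threshold is None:
        threshold = sum(values) / len(values)
    return [1 if v > threshold else 0 for v in values]


def majority_baseline_accuracy(labels):
    """Accuracy of a predictor that always outputs the largest class."""
    ones = sum(labels)
    return max(ones, len(labels) - ones) / len(labels)


# Hypothetical toy trace: five points on a line, 4 Å apart (made-up data).
toy_coords = [(4.0 * k, 0.0, 0.0) for k in range(5)]
toy_counts = coordination_numbers(toy_coords, radius=8.0)
toy_labels = two_state_labels(toy_counts)
toy_baseline = majority_baseline_accuracy(toy_labels)
```

Any trained predictor would be judged against `toy_baseline`, which is the sense in which the abstract reports gains over the baseline statistical predictor.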


IEEE/ACM Transactions on Computational Biology and Bioinformatics | 2008

Reconstruction of 3D Structures From Protein Contact Maps

Marco Vassura; Luciano Margara; P. Di Lena; Filippo Medri; P. Fariselli; Rita Casadio

The prediction of the protein tertiary structure from solely its residue sequence (the so-called Protein Folding Problem) is one of the most challenging problems in Structural Bioinformatics. We focus on the protein residue contact map. When this map is assigned, it is possible to reconstruct the 3D structure of the protein backbone. The general problem of recovering a set of 3D coordinates consistent with some given contact map is known as a unit-disk-graph realization problem, and it has recently been proven to be NP-hard. In this paper we describe a heuristic method (COMAR) that is able to reconstruct with an unprecedented rate (3-15 seconds) a 3D model that exactly matches the target contact map of a protein. Working with a non-redundant set of 1760 proteins, we find that the scoring efficiency of finding a 3D model very close to the protein native structure depends on the threshold value adopted to compute the protein residue contact map. Contact maps whose threshold values range from 10 to 18 Ångströms allow reconstructing 3D models that are very similar to the protein's native structure.
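As a rough illustration of the objects involved, the sketch below builds a binary contact map from 3D coordinates at a chosen threshold and counts the residue pairs on which two maps disagree. It does not implement COMAR; the function names and the toy coordinates are hypothetical. It also shows a general property of distance-based maps: a mirror image of a structure yields the same contact map, because reflections preserve distances.

```python
import math


def contact_map(coords, threshold):
    """Binary contact map: 1 if two distinct residues are within `threshold` Å."""
    n = len(coords)
    return [
        [
            1 if i != j and math.dist(coords[i], coords[j]) <= threshold else 0
            for j in range(n)
        ]
        for i in range(n)
    ]


def map_mismatches(map_a, map_b):
    """Number of residue pairs (i < j) on which two contact maps disagree."""
    n = len(map_a)
    return sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if map_a[i][j] != map_b[i][j]
    )


# Hypothetical four-residue "native" trace (made-up coordinates, in Å).
native = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 1.0)]
# Mirror image (z -> -z): same pairwise distances, hence the same map.
mirrored = [(x, y, -z) for x, y, z in native]
# Fully stretched chain: loses the contact between the first and last residue.
stretched = [(3.8 * k, 0.0, 0.0) for k in range(4)]

target = contact_map(native, threshold=5.0)
mirror_errors = map_mismatches(target, contact_map(mirrored, threshold=5.0))
stretch_errors = map_mismatches(target, contact_map(stretched, threshold=5.0))
```

A model "exactly matches" the target map in the sense used above when the mismatch count is zero; the mirrored model satisfies this while the stretched one does not.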


Proteins | 2003

A neural network approach to evaluate fold recognition results

David Juan; Osvaldo Graña; Florencio Pazos; P. Fariselli; Rita Casadio; Alfonso Valencia

Fold recognition techniques assist the exploration of protein structures, and web‐based servers are part of the standard set of tools used in the analysis of biochemical problems. Despite their success, current methods are only able to predict the correct fold in a relatively small number of cases. We propose an approach that improves the selection of correct folds from among the results of two methods implemented as web servers (SAMT99 and 3DPSSM). Our approach is based on the training of a system of neural networks with models generated by the servers and a set of associated characteristics such as the quality of the sequence‐structure alignment, distribution of sequence features (sequence‐conserved positions and apolar residues), and compactness of the resulting models. Our results show that it is possible to detect adequate folds to model 80% of the sequences with a high level of confidence. The improvements achieved by taking into account sequence characteristics open the door to future improvements by directly including such factors in the step of model generation. This approach has been implemented as an automatic system LIBELLULA, available as a public web server at http://www.pdg.cnb.uam.es/servers/libellula.html. Proteins 2003;50:600–608.


Gene | 1998

Can functional regions of proteins be predicted from their coding sequences? The case study of G-protein coupled receptors

P Arrigo; P. Fariselli; Rita Casadio

A filter based on a set of unsupervised neural networks trained with a winner-take-all strategy discloses signals along the coding sequences of G-protein coupled receptors. By comparing with the existing experimental data it appears that these signals correlate with putative functional domains of the proteins. After protein alignment within subfamilies, signals cluster in protein regions which, according to the presently available experimental results, are described as possible functional domains of the folded proteins. The mapping procedure reveals characteristic regions in the coding sequences common and/or characteristic of the receptor subtype. This is particularly noticeable for the third cytoplasmic loop, which is likely to be involved in the molecular coupling of all the subfamilies with G-proteins. The results indicate that our mapping can highlight intrinsic representative features of the coding sequences which, in the case of G-protein coupled receptors, are characteristic of protein functional regions and suggest a possible application of the filter for predicting functional determinants in proteins starting from the coding sequence.


IEEE/ACM Transactions on Computational Biology and Bioinformatics | 2011

Is There an Optimal Substitution Matrix for Contact Prediction with Correlated Mutations?

P. Di Lena; P. Fariselli; Luciano Margara; Marco Vassura; Rita Casadio

Correlated mutations in proteins are believed to occur in order to preserve the protein functional folding through evolution. Their values can be deduced from sequence and/or structural alignments and are indicative of residue contacts in the protein three-dimensional structure. A correlation among pairs of residues is routinely evaluated with the Pearson correlation coefficient and the MCLACHLAN similarity matrix. In the literature, there is no justification for the adoption of the MCLACHLAN matrix instead of other substitution matrices. In this paper, we approach the problem of computing the optimal similarity matrix for contact prediction with correlated mutations, i.e., the similarity matrix that maximizes the accuracy of contact prediction with correlated mutations. We describe an optimization procedure, based on the gradient descent method, for computing the optimal similarity matrix and perform an extensive number of experimental tests. Our tests show that there is a large number of optimal matrices that perform similarly to MCLACHLAN. We also find that the upper limit to the accuracy achievable in protein contact prediction is independent of the optimized similarity matrix. This suggests that the poor scoring of the correlated mutations approach may be due to the choice of the linear correlation function in evaluating correlated mutations.
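The scoring scheme referred to here can be sketched as follows, assuming the common formulation in which each alignment column is turned into a vector of residue-pair similarities over all pairs of sequences, and two columns are then compared with the Pearson correlation coefficient. The identity-based `toy_similarity` is a stand-in assumed for illustration; it is not the MCLACHLAN matrix, and the function names are hypothetical.

```python
from itertools import combinations
from statistics import mean, pstdev


def toy_similarity(a, b):
    """Stand-in similarity (assumption): 1 for identical residues, else 0."""
    return 1.0 if a == b else 0.0


def column_pair_scores(column, similarity):
    """Similarity of the residues in one column for every pair of sequences."""
    return [similarity(a, b) for a, b in combinations(column, 2)]


def correlated_mutation_score(col_i, col_j, similarity=toy_similarity):
    """Pearson correlation between the pair-similarity vectors of two columns."""
    s_i = column_pair_scores(col_i, similarity)
    s_j = column_pair_scores(col_j, similarity)
    sd_i, sd_j = pstdev(s_i), pstdev(s_j)
    if sd_i == 0 or sd_j == 0:
        return 0.0  # a fully conserved column carries no covariation signal
    m_i, m_j = mean(s_i), mean(s_j)
    cov = mean((x - m_i) * (y - m_j) for x, y in zip(s_i, s_j))
    return cov / (sd_i * sd_j)


# Hypothetical alignment columns over four sequences (made-up data).
covarying = correlated_mutation_score("AAGG", "LLVV")
conserved = correlated_mutation_score("AAGG", "KKKK")
```

Swapping `toy_similarity` for a lookup into a substitution matrix changes only the `similarity` argument, which is the quantity the optimization described above would tune.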


Current Protein & Peptide Science | 2010

The Prediction of Protein-Protein Interacting Sites in Genome-Wide Protein Interaction Networks: The Test Case of the Human Cell Cycle

Lisa Bartoli; Pier Luigi Martelli; Ivan Rossi; P. Fariselli; Rita Casadio

In this paper we aim to investigate possible correlations between the number of putative interaction patches of a given protein, as inferred by an algorithm that we have developed, and its degree (number of edges of the protein node in a protein interaction network). We focus on the human cell cycle that, as compared with other biological processes, comprises the largest number of proteins whose structure is known at atomic resolution both as monomers and as interacting complexes. For predicting interaction patches we specifically develop an HM-SVM-based method reaching 71% overall accuracy with a correlation coefficient value equal to 0.43 on a non-redundant set of protein complexes. To test the biological meaning of our predictions, we also explore whether interacting patches contain energetically important residues and/or disease-related mutations and find that predicted patches are endowed with both features. Based on this, we propose that mapping the protein with all the predicted interaction patches bridges the molecule to the interactome at the cell level. To test our hypothesis we download interaction data from interaction databases and find that the number of predicted interaction patches significantly correlates (Pearson correlation value >0.3) with the number of the known interactions (edges) per protein in the human interactome, as contained in MINT and IntAct. We also show that the correlation increases (Pearson correlation value >0.5) when the subcellular co-localization and the co-expression levels of the interacting partners are taken into account.


Sar and Qsar in Environmental Research | 2000

Neural networks predict protein folding and structure: artificial intelligence faces biomolecular complexity

Rita Casadio; Mario Compiani; P. Fariselli; Irene Jacoboni; Pier Luigi Martelli

In the genomic era DNA sequencing is increasing our knowledge of the molecular structure of genetic codes from bacteria to man at a hyperbolic rate. Billions of nucleotides and millions of amino acids are already filling the electronic files of the databases presently available, which contain a tremendous amount of information on the most biologically relevant macromolecules, such as DNA, RNA and proteins. The most urgent problem originates from the need to single out the relevant information amidst a wealth of general features. Intelligent tools are therefore needed to optimise the search. Data mining for sequence analysis in biotechnology has been substantially aided by the development of new powerful methods borrowed from the machine learning approach. In this paper we discuss the application of artificial feedforward neural networks to deal with some fundamental problems tied to the folding process and the structure-function relationship in proteins.


Sar and Qsar in Environmental Research | 2002

Protein structure prediction and biomolecular recognition: From protein sequence to peptidomimetic design with the human β3 integrin

Rita Casadio; Mario Compiani; A. Facchiano; P. Fariselli; Pier Luigi Martelli; Irene Jacoboni; Ivan Rossi

Computational tools can bridge the gap between sequence and protein 3D structure based on the notion that information is to be retrieved from the databases and that knowledge-based methods can help in approaching a solution of the protein-folding problem. To this aim our group has implemented neural network-based predictors capable of performing with some success in different tasks, including predictions of the secondary structure of globular and membrane proteins, the topology of membrane proteins and porins, and stable α-helical segments suited for protein design. Moreover, we have developed methods for predicting contact maps in proteins and the probability of finding a cysteine in a disulfide bridge, tools which can contribute to the goal of predicting the 3D structure starting from the sequence (the so-called ab initio prediction). All our predictors take advantage of evolutionary information derived from the structural alignments of homologous (evolutionarily related) proteins and taken from the sequence and structure databases. When it is necessary to build models for proteins of unknown spatial structure, which have very little homology with other proteins of known structure, non-standard techniques need to be developed and the tools for protein structure predictions may help in protein modeling. The results of a recent simulation performed in our lab highlight the role of high-performance computing technology and of tools of computational biology in protein modeling and peptidomimetic design.


Human Mutation | 2017

Performance of in silico tools for the evaluation of p16INK4a (CDKN2A) variants in CAGI

Marco Carraro; Giovanni Minervini; Manuel Giollo; Yana Bromberg; Emidio Capriotti; Rita Casadio; Roland L. Dunbrack; Lisa Elefanti; P. Fariselli; Carlo Ferrari; Julian Gough; Panagiotis Katsonis; Emanuela Leonardi; Olivier Lichtarge; Chiara Menin; Pier Luigi Martelli; Abhishek Niroula; Lipika R. Pal; Susanna Repo; Maria Chiara Scaini; Mauno Vihinen; Qiong Wei; Qifang Xu; Yuedong Yang; Yizhou Yin; Jan Zaucha; Huiying Zhao; Yaoqi Zhou; Steven E. Brenner; John Moult

Correct phenotypic interpretation of variants of unknown significance for cancer‐associated genes is a diagnostic challenge as genetic screenings gain in popularity in the next‐generation sequencing era. The Critical Assessment of Genome Interpretation (CAGI) experiment aims to test and define the state of the art of genotype–phenotype interpretation. Here, we present the assessment of the CAGI p16INK4a challenge. Participants were asked to predict the effect on cellular proliferation of 10 variants for the p16INK4a tumor suppressor, a cyclin‐dependent kinase inhibitor encoded by the CDKN2A gene. Twenty‐two pathogenicity predictors were assessed with a variety of accuracy measures for reliability in a medical context. Different assessment measures were combined in an overall ranking to provide more robust results. The R scripts used for assessment are publicly available from a GitHub repository for future use in similar assessment exercises. Despite a limited test‐set size, our findings show a variety of results, with some methods performing significantly better. Methods combining different strategies frequently outperform simpler approaches. The best predictor, Yang&Zhou lab, uses a machine learning method combining an empirical energy function measuring protein stability with an evolutionary conservation term. The p16INK4a challenge highlights how subtle structural effects can neutralize otherwise deleterious variants.


Archive | 2008

The state of the art of membrane protein structure prediction: from sequence to 3D structure

Rita Casadio; P. Fariselli; Pier Luigi Martelli; Andrea Pierleoni; Ivan Rossi; G. von Heijne

Membrane proteins constitute a very large set of yet-to-be-characterized proteins mediating all the relevant life-related functions in both prokaryotes and eukaryotes. Estimates suggest that in whole genomes the content of this protein type may vary from 10 to 40% of the whole proteome, depending on the organism.
