Daniele Raimondi | Researchain

Publication

Featured research published by Daniele Raimondi.

European Conference on Computational Biology | 2016

Multilevel biological characterization of exomic variants at the protein level significantly improves the identification of their deleterious effects

Daniele Raimondi; Andrea Gazzo; Marianne Rooman; Tom Lenaerts; Wim F. Vranken

Motivation: There are now many predictors capable of identifying the likely phenotypic effects of single nucleotide variants (SNVs) or short in-frame Insertions or Deletions (INDELs) on the increasing amount of genome sequence data. Most of these predictors focus on SNVs and use a combination of features related to sequence conservation, biophysical, and/or structural properties to link the observed variant to either neutral or disease phenotype. Despite notable successes, the mapping between genetic variants and their phenotypic effects is riddled with levels of complexity that are not yet fully understood and that are often not taken into account in the predictions, despite their promise of significantly improving the prediction of deleterious mutants. Results: We present DEOGEN, a novel variant effect predictor that can handle both missense SNVs and in-frame INDELs. By integrating information from different biological scales and mimicking the complex mixture of effects that lead from the variant to the phenotype, we obtain significant improvements in the variant-effect prediction results. Next to the typical variant-oriented features based on the evolutionary conservation of the mutated positions, we added a collection of protein-oriented features that are based on functional aspects of the gene affected. We cross-validated DEOGEN on 36 825 polymorphisms, 20 821 deleterious SNVs, and 1038 INDELs from SwissProt. The multilevel contextualization of each (variant, protein) pair in DEOGEN provides a 10% improvement of MCC with respect to current state-of-the-art tools. Availability and implementation: The software and the data presented here are publicly available at http://ibsquare.be/deogen. Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.
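The improvement above is reported as MCC (Matthews correlation coefficient), a score for binary classifiers that stays informative when the two classes are unbalanced, as with the polymorphism and deleterious-variant counts quoted here. The following is a minimal sketch of how MCC is computed from confusion-matrix counts; the function name and the example counts are illustrative assumptions and are not taken from the paper.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts.

    tp/fn: deleterious variants predicted correctly/incorrectly.
    tn/fp: neutral variants predicted correctly/incorrectly.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention: MCC is taken as 0 when a row or column
        # of the confusion matrix is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Hypothetical counts, chosen only to show the call.
    print(matthews_corrcoef(tp=80, tn=90, fp=10, fn=20))
```

MCC ranges from -1 (every prediction wrong) through 0 (no better than chance) to +1 (every prediction right).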

Bioinformatics | 2015

Clustering-based model of cysteine co-evolution improves disulfide bond connectivity prediction and reduces homologous sequence requirements

Daniele Raimondi; Gabriele Orlando; Wim F. Vranken

Motivation: Cysteine residues have particular structural and functional relevance in proteins because of their ability to form covalent disulfide bonds. Bioinformatics tools that can accurately predict cysteine bonding states are already available, whereas it remains challenging to infer the disulfide connectivity pattern of unknown protein sequences. Improving accuracy in this area is highly relevant for the structural and functional annotation of proteins. Results: We predict the intra-chain disulfide bond connectivity patterns starting from known cysteine bonding states with an evolutionary-based unsupervised approach called Sephiroth that relies on high-quality alignments obtained with HHblits and is based on a coarse-grained cluster-based model of tandem cysteine mutations within a protein family. We compared our method with state-of-the-art unsupervised predictors and achieve a performance improvement of 25-27% while requiring an order of magnitude fewer aligned homologous sequences (∼10^3 instead of ∼10^4). Availability and implementation: The software described in this article and the datasets used are available at http://ibsquare.be/sephiroth. Contact: [email protected] Supplementary information: Supplementary material is available at Bioinformatics online.

Biophysical Journal | 2016

Early Folding Events, Local Interactions, and Conservation of Protein Backbone Rigidity

Rita Pancsa; Daniele Raimondi; Elisa Cilia; Wim F. Vranken

Protein folding is in its early stages largely determined by the protein sequence and complex local interactions between amino acids, resulting in lower energy conformations that provide the context for further folding into the native state. We compiled a comprehensive data set of early folding residues based on pulsed labeling hydrogen deuterium exchange experiments. These early folding residues have corresponding higher backbone rigidity as predicted by DynaMine from sequence, an effect also present when accounting for the secondary structures in the folded protein. We then show that the amino acids involved in early folding events are not more conserved than others, but rather, early folding fragments and the secondary structure elements they are part of show a clear trend toward conserving a rigid backbone. We therefore propose that backbone rigidity is a fundamental physical feature conserved by proteins that can provide important insights into their folding mechanisms and stability.

Scientific Reports | 2016

Observation selection bias in contact prediction and its implications for structural bioinformatics

Gabriele Orlando; Daniele Raimondi; Wim F. Vranken

Next Generation Sequencing is dramatically increasing the number of known protein sequences, with related experimentally determined protein structures lagging behind. Structural bioinformatics is attempting to close this gap by developing approaches that predict structure-level characteristics for uncharacterized protein sequences, with most of the developed methods relying heavily on evolutionary information collected from homologous sequences. Here we show that there is a substantial observational selection bias in this approach: the predictions are validated on proteins with known structures from the PDB, but exactly for those proteins significantly more homologs are available compared to less studied sequences randomly extracted from Uniprot. Structural bioinformatics methods that were developed this way are thus likely to have over-estimated performances; we demonstrate this for two contact prediction methods, where performances drop up to 60% when taking into account a more realistic amount of evolutionary information. We provide a bias-free dataset for the validation for contact prediction methods called NOUMENON.

PLOS ONE | 2015

An evolutionary view on Disulfide bond connectivities prediction using phylogenetic trees and a simple cysteine mutation model

Daniele Raimondi; Gabriele Orlando; Wim F. Vranken

Disulfide bonds are crucial for many structural and functional aspects of proteins. They have a stabilizing role during folding, can regulate enzymatic activity and can trigger allosteric changes in the protein structure. Moreover, knowledge of the topology of the disulfide connectivity can be relevant in genomic annotation tasks and can provide long range constraints for ab-initio protein structure predictors. In this paper we describe PhyloCys, a novel unsupervised predictor of disulfide bond connectivity from known cysteine oxidation states. For each query protein, PhyloCys retrieves and aligns homologs with HHblits and builds a phylogenetic tree using ClustalW. A simplified model of cysteine co-evolution is then applied to the tree in order to hypothesize the presence of oxidized cysteines in the inner nodes of the tree, which represent ancestral protein sequences. The tree is then traversed from the leaves to the root and the putative disulfide connectivity is inferred by observing repeated patterns of tandem mutations between a sequence and its ancestors. A final correction is applied using the Edmonds-Gabow maximum weight perfect matching algorithm. The evolutionary approach applied in PhyloCys results in disulfide bond predictions equivalent to Sephiroth, another approach that takes whole sequence information into account, and is 26–29% better than state-of-the-art methods based on cysteine covariance patterns in multiple sequence alignments, while requiring one order of magnitude fewer homologous sequences (10^3 instead of 10^4), thus extending its range of applicability. The software described in this article and the datasets used are available at http://ibsquare.be/phylocys.
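The final correction step above picks, among all ways of pairing the oxidized cysteines, the pairing with the highest total score. The sketch below illustrates that objective only: it enumerates every perfect matching by brute force instead of using the Edmonds-Gabow algorithm named in the abstract, and the function names, residue positions and pair scores are hypothetical.

```python
from typing import Dict, FrozenSet, Iterator, List, Tuple

Pair = Tuple[int, int]


def perfect_matchings(items: List[int]) -> Iterator[List[Pair]]:
    """Yield every way to split `items` (even length) into disjoint pairs."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for matching in perfect_matchings(remaining):
            yield [(first, partner)] + matching


def best_connectivity(
    cysteines: List[int], score: Dict[FrozenSet[int], float]
) -> List[Pair]:
    """Return the pairing of cysteine positions with maximum total score.

    Brute force, so only practical for a handful of cysteines; pairs
    missing from `score` count as 0.
    """
    if len(cysteines) % 2 != 0:
        raise ValueError("a perfect matching needs an even number of cysteines")
    return max(
        perfect_matchings(sorted(cysteines)),
        key=lambda m: sum(score.get(frozenset(p), 0.0) for p in m),
    )


if __name__ == "__main__":
    # Hypothetical pair scores for four oxidized cysteines.
    scores = {
        frozenset({3, 28}): 0.9,
        frozenset({15, 40}): 0.8,
        frozenset({3, 15}): 0.5,
        frozenset({28, 40}): 0.1,
    }
    print(best_connectivity([3, 15, 28, 40], scores))
```

With 2n cysteines there are (2n-1)!! candidate pairings, so enumeration quickly becomes impractical; a polynomial-time matching algorithm such as the one cited in the abstract avoids that blow-up.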

Human Mutation | 2017

Investigating The Molecular Mechanisms Behind Uncharacterized Cysteine Losses from Prediction of Their Oxidation State

Daniele Raimondi; Gabriele Orlando; Joris Messens; Wim F. Vranken

Cysteines are among the rarest amino acids in nature, and are both functionally and structurally very important for proteins. The ability of cysteines to form disulfide bonds is especially relevant, both for constraining the folded state of the protein and for performing enzymatic duties. But how does the variation record of human proteins reflect their functional importance and structural role, especially with regard to deleterious mutations? We created HUMCYS, a manually curated dataset of single amino acid variants that (1) have a known disease/neutral phenotypic outcome and (2) cause the loss of a cysteine, in order to investigate how mutated cysteines relate to structural aspects such as surface accessibility and cysteine oxidation state. We also have developed a sequence‐based in silico cysteine oxidation predictor to overcome the scarcity of experimentally derived oxidation annotations, and applied it to extend our analysis to classes of proteins for which the experimental determination of their structure is technically challenging, such as transmembrane proteins. Our investigation shows that we can gain insights into the reason behind the outcome of cysteine losses in otherwise uncharacterized proteins, and we discuss the possible molecular mechanisms leading to deleterious phenotypes, such as the involvement of the mutated cysteine in a structurally or enzymatically relevant disulfide bond.

Scientific Reports | 2017

Exploring the Sequence-based Prediction of Folding Initiation Sites in Proteins

Daniele Raimondi; Gabriele Orlando; Rita Pancsa; Taushif Khan; Wim F. Vranken

Protein folding is a complex process that can lead to disease when it fails. Especially poorly understood are the very early stages of protein folding, which are likely defined by intrinsic local interactions between amino acids close to each other in the protein sequence. We here present EFoldMine, a method that predicts, from the primary amino acid sequence of a protein, which amino acids are likely involved in early folding events. The method is based on early folding data from hydrogen deuterium exchange (HDX) NMR pulsed labelling experiments, and uses backbone and sidechain dynamics as well as secondary structure propensities as features. The EFoldMine predictions give insights into the folding process, as illustrated by a qualitative comparison with independent experimental observations. Furthermore, on a quantitative proteome scale, the predicted early folding residues tend to become the residues that interact the most in the folded structure, and they are often residues that display evolutionary covariation. The connection of the EFoldMine predictions with both folding pathway data and the folded protein structure suggests that the initial statistical behavior of the protein chain with respect to local structure formation has a lasting effect on its subsequent states.

Nucleic Acids Research | 2017

DEOGEN2: prediction and interactive visualization of single amino acid variant deleteriousness in human proteins

Daniele Raimondi; Ibrahim Tanyalcin; Julien Ferté; Andrea Gazzo; Gabriele Orlando; Tom Lenaerts; Marianne Rooman; Wim F. Vranken

High-throughput sequencing methods are generating enormous amounts of genomic data, giving unprecedented insights into human genetic variation and its relation to disease. An individual human genome contains millions of Single Nucleotide Variants: to discriminate the deleterious from the benign ones, a variety of methods have been developed that predict whether a protein-coding variant likely affects the carrier individual's health. We present such a method, DEOGEN2, which incorporates heterogeneous information about the molecular effects of the variants, the domains involved, the relevance of the gene and the interactions in which it participates. This extensive contextual information is non-linearly mapped into one single deleteriousness score for each variant. Since for the non-expert user it is sometimes still difficult to assess what this score means, how it relates to the encoded protein, and where it originates from, we developed an interactive online framework (http://deogen2.mutaframe.com/) to better present the DEOGEN2 deleteriousness predictions of all possible variants in all human proteins. The prediction is visualized so both expert and non-expert users can gain insights into the meaning, protein context and origins of each prediction.

Bioinformatics | 2017

SVM-dependent pairwise HMM: an application to protein pairwise alignments

Gabriele Orlando; Daniele Raimondi; Taushif Khan; Tom Lenaerts; Wim F. Vranken

Motivation: Methods able to provide reliable protein alignments are crucial for many bioinformatics applications. In recent years many different algorithms have been developed and various kinds of information, from sequence conservation to secondary structure, have been used to improve the alignment performances. This is especially relevant for proteins with highly divergent sequences. However, recent works suggest that different features may have different importance in diverse protein classes and it would be an advantage to have more customizable approaches, capable of dealing with different alignment definitions. Results: Here we present Rigapollo, a highly flexible pairwise alignment method based on a pairwise HMM‐SVM that can use any type of information to build alignments. Rigapollo lets the user decide the optimal features to align their protein class of interest. It outperforms current state-of-the-art methods on two well‐known benchmark datasets when aligning highly divergent sequences. Availability and implementation: A Python implementation of the algorithm is available at http://ibsquare.be/rigapollo. Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.

Bioinformatics | 2018

Ultra-fast global homology detection with Discrete Cosine Transform and Dynamic Time Warping

Daniele Raimondi; Gabriele Orlando; Yves Moreau; Wim F. Vranken

Motivation: Evolutionary information is crucial for the annotation of proteins in bioinformatics. The amount of retrieved homologs often correlates with the quality of predicted protein annotations related to structure or function. With a growing amount of sequences available, fast and reliable methods for homology detection are essential, as they have a direct impact on predicted protein annotations. Results: We developed a discriminative, alignment-free algorithm for homology detection with quasi-linear complexity, enabling theoretically much faster homology searches. To reach this goal, we convert the protein sequence into numeric biophysical representations. These are shrunk to a fixed length using a novel vector quantization method which uses a Discrete Cosine Transform compression. We then compute, for each compressed representation, similarity scores between proteins with the Dynamic Time Warping algorithm and we feed them into a Random Forest. The WARP performances are comparable with state-of-the-art methods. Availability and implementation: The method is available at http://ibsquare.be/warp. Supplementary information: Supplementary data are available at Bioinformatics online.
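The two signal-processing steps named above can be illustrated with a short, dependency-free sketch: a per-residue numeric profile is reduced to its first k Discrete Cosine Transform coefficients, and two such fixed-length vectors are then compared with Dynamic Time Warping. This is a textbook DCT-II and a textbook DTW, not the WARP implementation; the function names, the choice of k and the example profiles are assumptions, and the Random Forest stage is omitted.

```python
import math
from typing import List, Sequence


def dct_compress(signal: Sequence[float], k: int) -> List[float]:
    """Keep the first k (unnormalized) DCT-II coefficients of `signal`."""
    n = len(signal)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and len(signal)")
    return [
        sum(x * math.cos(math.pi / n * (i + 0.5) * j) for i, x in enumerate(signal))
        for j in range(k)
    ]


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Classic dynamic-programming DTW with absolute difference as local cost."""
    inf = float("inf")
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: insertion, deletion, match.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[len(a)][len(b)]


if __name__ == "__main__":
    # Hypothetical per-residue profiles of two proteins of different length.
    profile_a = [1.8, -4.5, 2.5, 3.8, -0.4, 4.2, -3.5, 1.9]
    profile_b = [1.8, -3.9, 2.8, 3.8, -0.8, 4.5]
    k = 4
    print(dtw_distance(dct_compress(profile_a, k), dct_compress(profile_b, k)))
```

In this sketch the compressed vectors always have length k, so the DTW table has k × k cells whatever the original protein lengths.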
