Publication


Featured research published by Michele Donato.


Frontiers in Physiology | 2013

Methods and approaches in the topology-based analysis of biological pathways

Cristina Mitrea; Zeinab Taghavi; Behzad Bokanizad; Samer Hanoudi; Rebecca Tagett; Michele Donato; Călin Voichiţa; Sorin Drăghici

The goal of pathway analysis is to identify the pathways significantly impacted in a given phenotype. Many current methods are based on algorithms that consider pathways as simple gene lists, dramatically under-utilizing the knowledge that such pathways are meant to capture. During the past few years, a plethora of methods claiming to incorporate various aspects of the pathway topology have been proposed. These topology-based methods, sometimes referred to as “third generation,” have the potential to better model the phenomena described by pathways. Although there is now a large variety of approaches used for this purpose, no review is currently available to offer guidance for potential users and developers. This review covers 22 such topology-based pathway analysis methods published in the last decade. We compare these methods based on: type of pathways analyzed (e.g., signaling or metabolic), input (subset of genes, all genes, fold changes, gene p-values, etc.), mathematical models, pathway scoring approaches, output (one or more pathway scores, p-values, etc.) and implementation (web-based, standalone, etc.). We identify and discuss challenges, arising both in methodology and in pathway representation, including inconsistent terminology, different data formats, lack of meaningful benchmarks, and the lack of tissue and condition specificity.


Bioinformatics | 2011

The Biological Connection Markup Language

Luca Beltrame; Enrica Calura; Razvan R. Popovici; Lisa Rizzetto; Damariz Rivero Guedez; Michele Donato; Chiara Romualdi; Sorin Draghici; Duccio Cavalieri

Motivation: Many models and analyses of signaling pathways have been proposed. However, none of them takes into account that a biological pathway is not a fixed system, but instead depends on the organism, tissue and cell type as well as on physiological, pathological and experimental conditions. Results: The Biological Connection Markup Language (BCML) is a format to describe, annotate and visualize pathways. BCML is able to store multiple types of information, permitting a selective view of the pathway as it exists and/or behaves in specific organisms, tissues and cells. Furthermore, BCML can be automatically converted into data formats suitable for analysis and into a fully SBGN-compliant graphical representation, making it an important tool that can be used by both computational biologists and ‘wet lab’ scientists. Availability and implementation: The XML schema and the BCML software suite are freely available under the LGPL for download at http://bcml.dc-atlas.net. They are implemented in Java and supported on MS Windows, Linux and OS X. Contact: [email protected]; [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.


International Conference on Machine Learning and Applications | 2012

Incorporating Gene Significance in the Impact Analysis of Signaling Pathways

Calin Voichita; Michele Donato; Sorin Draghici

Identification of the most impacted signaling pathways in a given condition is a crucial step in understanding the underlying biological mechanism. An impact analysis that is able to take into consideration the structure of a given signaling pathway was proposed to measure the impact on each pathway given a list of differentially expressed (DE) genes and their fold changes. Here, we investigated the utility of incorporating the individual gene significance in the impact analysis of signaling pathways. We propose two alternative models to incorporate the individual gene p-values and compare their performance over a pool of 24 datasets. In addition, the two new models offer the ability to work with the entire set of gene expression measurements, thus eliminating the need to select differentially expressed genes.


PLOS ONE | 2016

Cross-Clustering: A Partial Clustering Algorithm with Automatic Estimation of the Number of Clusters.

Paola Tellaroli; Marco Bazzi; Michele Donato; Alessandra Rosalba Brazzale; Sorin Drăghici

Four of the most common limitations of the many available clustering methods are: i) the lack of a proper strategy to deal with outliers; ii) the need for a good a priori estimate of the number of clusters to obtain reasonable results; iii) the lack of a method able to detect when partitioning of a specific data set is not appropriate; and iv) the dependence of the result on the initialization. Here we propose Cross-clustering (CC), a partial clustering algorithm that overcomes these four limitations by combining the principles of two well-established hierarchical clustering algorithms: Ward’s minimum variance and Complete-linkage. We validated CC by comparing it with a number of existing clustering methods, including Ward’s and Complete-linkage. We show, on both simulated and real datasets, that CC performs better than the other methods in terms of: the identification of the correct number of clusters, the identification of outliers, and the determination of real cluster memberships. We used CC on samples, in order to identify disease subtypes, and on gene profiles, in order to determine groups of genes with the same behavior. Results obtained on a non-biological dataset show that the method is general enough to be successfully used in such diverse applications. The algorithm has been implemented in the statistical language R and is freely available from the CRAN contributed packages repository.


Bioinformatics | 2016

A novel bi-level meta-analysis approach: applied to biological pathway analysis.

Tin Nguyen; Rebecca Tagett; Michele Donato; Cristina Mitrea; Sorin Draghici

Motivation: The accumulation of high-throughput data in public repositories creates a pressing need for integrative analysis of multiple datasets from independent experiments. However, study heterogeneity, study bias, outliers and the lack of power of available methods present real challenges in integrating genomic data. One practical drawback of many P-value-based meta-analysis methods, including Fisher's, Stouffer's, minP and maxP, is that they are sensitive to outliers. Another drawback is that, because they perform just one statistical test for each individual experiment, they may not fully exploit the potentially large number of samples within each study. Results: We propose a novel bi-level meta-analysis approach that employs the additive method and the Central Limit Theorem within each individual experiment and also across multiple experiments. We prove that the bi-level framework is robust against bias, less sensitive to outliers than other methods, and more sensitive to small changes in signal. For comparative analysis, we demonstrate that the intra-experiment analysis has more power than the equivalent statistical test performed on a single large experiment. For pathway analysis, we compare the proposed framework with classical meta-analysis approaches (Fisher's, Stouffer's and the additive method) as well as with a dedicated pathway meta-analysis package (MetaPath), using 1252 samples from 21 datasets related to three human diseases: acute myeloid leukemia (9 datasets), type II diabetes (5 datasets) and Alzheimer's disease (7 datasets). Our framework outperforms its competitors in correctly identifying pathways relevant to the phenotypes. The framework is sufficiently general to be applied to any type of statistical meta-analysis. Availability and implementation: The R scripts are available on demand from the authors. Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.
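The classical P-value combination methods named in this abstract can be sketched in a few lines. The following is a minimal illustration, not the authors' bi-level framework or their R scripts: the function names are hypothetical, the additive method is shown with a plain normal (Central Limit Theorem) approximation, and every input p-value is assumed to lie strictly between 0 and 1.

```python
import math
from statistics import NormalDist


def fisher_combined(pvalues):
    """Fisher's method: -2 * sum(ln p) follows a chi-square with 2k degrees of freedom."""
    k = len(pvalues)
    half_stat = -sum(math.log(p) for p in pvalues)  # half of the test statistic
    # Closed-form chi-square survival function, valid for even degrees of freedom (2k).
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


def stouffer_combined(pvalues):
    """Stouffer's method: the sum of z-scores divided by sqrt(k) is standard normal."""
    norm = NormalDist()
    z = sum(norm.inv_cdf(1.0 - p) for p in pvalues) / math.sqrt(len(pvalues))
    return 1.0 - norm.cdf(z)


def additive_clt(pvalues):
    """Additive method with a normal approximation: the sum of k independent
    uniform p-values has mean k/2 and variance k/12."""
    k = len(pvalues)
    return NormalDist(mu=k / 2.0, sigma=math.sqrt(k / 12.0)).cdf(sum(pvalues))


# Hypothetical input: one extreme p-value among otherwise unremarkable studies.
outlier_case = [1e-10, 0.5, 0.5, 0.5, 0.5]
print(fisher_combined(outlier_case))  # dominated by the single extreme value
print(additive_clt(outlier_case))     # much less affected by it
```

On this made-up input, the single extreme value drives Fisher's combined p-value to a very small number while the additive result stays unremarkable, which is the outlier sensitivity the abstract refers to.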


Bioinformatics | 2018

A novel computational approach for drug repurposing using systems biology

Azam Peyvandipour; Nafiseh Saberian; Adib Shafi; Michele Donato; Sorin Draghici

Motivation: Identification of novel therapeutic effects for existing US Food and Drug Administration (FDA)-approved drugs, known as drug repurposing, is an approach that aims to dramatically shorten the drug discovery process, which is costly, slow and risky. Several computational approaches use transcriptional data to find potential repurposing candidates. The main hypothesis of such approaches is that if the gene expression signature of a particular drug is opposite to the gene expression signature of a disease, that drug may have a potential therapeutic effect on the disease. However, this may not be optimal since it fails to consider the different roles of genes and their dependencies at the system level. Results: We propose a systems biology approach to discover novel therapeutic roles for established drugs that addresses some of the issues in the current approaches. To do so, we use publicly available drug and disease data to build a drug-disease network by considering all interactions between drug targets and disease-related genes in the context of all known signaling pathways. This network is integrated with gene-expression measurements to identify drugs with new desired therapeutic effects based on a system-level analysis method. We compare the proposed approach with the drug repurposing approach proposed by Sirota et al. on four human diseases: idiopathic pulmonary fibrosis, non-small cell lung cancer, prostate cancer and breast cancer. We evaluate the proposed approach based on its ability to re-discover drugs that are already FDA-approved for a given disease. Availability and implementation: The R package DrugDiseaseNet is under review for publication in Bioconductor and is available at https://github.com/azampvd/DrugDiseaseNet. Supplementary information: Supplementary data are available at Bioinformatics online.
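The baseline hypothesis described in this abstract (a drug whose expression signature is opposite to the disease signature is a repurposing candidate) can be illustrated with a simple anti-correlation ranking. This is only a sketch of that baseline idea, not the proposed systems biology method or the DrugDiseaseNet package; the gene names, drug names, signature values and the choice of Pearson correlation as the score are all assumptions made for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def rank_drugs(disease_signature, drug_signatures):
    """Rank drugs from most to least anti-correlated with the disease signature."""
    genes = sorted(disease_signature)
    disease = [disease_signature[g] for g in genes]
    scores = {
        drug: pearson(disease, [signature[g] for g in genes])
        for drug, signature in drug_signatures.items()
    }
    # The most negative correlation comes first.
    return sorted(scores.items(), key=lambda item: item[1])


# Hypothetical log fold changes for four genes.
disease = {"G1": 2.0, "G2": -1.5, "G3": 0.5, "G4": -0.8}
drugs = {
    "drug_A": {"G1": -1.0, "G2": 0.75, "G3": -0.25, "G4": 0.4},  # reverses the disease
    "drug_B": {"G1": 1.0, "G2": -0.75, "G3": 0.25, "G4": -0.4},  # mimics the disease
}
ranked = rank_drugs(disease, drugs)
print(ranked[0][0])  # the candidate whose signature best opposes the disease
```

A score of this kind treats every gene independently, which is the limitation the abstract raises when it argues for taking gene roles and pathway dependencies into account.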


Congress on Evolutionary Computation | 2013

A genetic algorithms framework for estimating individual gene contributions in signaling pathways

Calin Voichita; Michele Donato; Sorin Draghici

With the rapid advancements in our data acquisition capabilities and the increased availability of gene interaction databases, a variety of pathway analysis tools have been proposed. However, all these methods are dependent on the quality of the available pathways. These pathways were designed to describe the general mechanism of a particular disease or biological process. The known pathways encompass the results of many biological experiments and, even though they represent our current understanding of those particular biological processes, they are still generally considered sketchy and incomplete. One piece of information that is generally missing regards the role or importance of a gene in a given pathway, which we refer to as the gene contribution. We propose here a method, based on genetic algorithms, to objectively quantify the contribution of each gene. Using a pool of 24 data sets from 12 different conditions divided into training and test groups, we show how an impact pathway analysis method achieves significantly better results with the newly estimated gene contributions when compared with both the initial default contributions and randomly selected gene contributions.


BioSystems | 2013

Assessing co-regulation of directly linked genes in biological networks using microarray time series analysis.

Maria Rosaria Del Sorbo; Walter Balzano; Michele Donato; Sorin Draghici

Differential expression of genes detected with the analysis of high throughput genomic experiments is a commonly used intermediate step for the identification of signaling pathways involved in the response to different biological conditions. The impact analysis was the first approach for the analysis of signaling pathways involved in a certain biological process that was able to take into account not only the magnitude of the expression change of the genes but also the topology of signaling pathways, including the type of each interaction between the genes. In the impact analysis, signaling pathways are represented as weighted directed graphs with genes as nodes and the interactions between genes as edges. Edge weights are represented by a β factor, the regulatory efficiency, which is assumed to be equal to 1 in inductive interactions between genes and equal to -1 in repressive interactions. This study presents a similarity analysis between gene expression time series aimed at finding correspondences with the regulatory efficiency, i.e. the β factor as found in a widely used pathway database. Here, we focused on correlations among genes directly connected in signaling pathways, assuming that the expression variations of upstream genes impact immediately downstream genes in a short time interval and without significant influences by the interactions with other genes. Time series were processed using three different similarity metrics. The first metric is based on bit string matching; the second one is a specific application of Dynamic Time Warping to detect similarities even in the presence of stretching and delays; the third one is a quantitative comparative analysis resulting from an evaluation of the frequency domain representation of time series: the similarity metric is the correlation between dominant spectral components. These three approaches are tested on real data and pathways, and a comparison is performed using Information Retrieval benchmark tools, indicating the frequency approach as the best similarity metric among the three, for its ability to detect the correlation based on the correspondence of the most significant frequency components.
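To make the second metric more concrete, the sketch below implements textbook Dynamic Time Warping on two short series. It is an assumption-laden illustration rather than the specific application used in the study: the absolute difference as local cost, the function name and the toy expression values are all hypothetical.

```python
def dtw_distance(a, b):
    """Classic dynamic-programming DTW with absolute difference as local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] is the cheapest alignment of the first i points of a
    # with the first j points of b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # stretch b
                cost[i][j - 1],      # stretch a
                cost[i - 1][j - 1],  # advance both series
            )
    return cost[n][m]


# Hypothetical expression profiles: the downstream gene repeats the upstream
# pattern one time point later.
upstream = [0, 1, 2, 1, 0, 0]
downstream = [0, 0, 1, 2, 1, 0]
pointwise = sum(abs(x - y) for x, y in zip(upstream, downstream))
print(pointwise, dtw_distance(upstream, downstream))
```

The point-by-point distance penalizes the one-step delay, while the warped alignment absorbs it, which is why DTW suits the assumption that upstream changes reach downstream genes after a short interval.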


International Symposium on Neural Networks | 2010

Signaling pathways coupling phenomena

Michele Donato; Sorin Draghici

Two radically different approaches are currently available to identify the signaling pathways that are significantly impacted in a given condition: enrichment analysis and impact analysis. These approaches calculate a p-value that aims to quantify the significance of the involvement of the given pathway in the condition under study. These p-values were thought to be inversely proportional to the likelihood of their respective pathways being involved in the given condition, and hence to be independent. Here we show that various pathways can affect each other's p-values in significant ways. Thus, the significance of a given pathway in a given experiment has to be interpreted in the context of the other pathways that appear to be significant. In certain circumstances, pathways previously found to be significant with some of the existing methods may not be so. We hypothesize that the phenomenon is related to the number of common genes between different pathways. Here we present results obtained by analyzing pathways obtained from the KEGG signaling pathways database. However, the same phenomenon is expected to influence the analysis of any pathways from any pathway repository.
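A toy over-representation test shows how genes shared between pathways can tie their p-values together. The sketch below uses the standard hypergeometric tail as the enrichment p-value, which is one common choice and not necessarily the exact test used in the paper; the gene universe, the two pathways and the list of differentially expressed genes are invented for illustration.

```python
from math import comb


def enrichment_pvalue(n_universe, n_pathway, n_de, n_overlap):
    """Hypergeometric tail: probability of at least n_overlap DE genes in the pathway."""
    upper = min(n_pathway, n_de)
    tail = sum(
        comb(n_pathway, k) * comb(n_universe - n_pathway, n_de - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / comb(n_universe, n_de)


def pathway_pvalues(universe, pathways, de_genes):
    """Compute one enrichment p-value per pathway for a set of DE genes."""
    return {
        name: enrichment_pvalue(
            len(universe), len(genes), len(de_genes), len(genes & de_genes)
        )
        for name, genes in pathways.items()
    }


# Hypothetical data: 20 genes, two pathways that share g3 and g4.
universe = {f"g{i}" for i in range(20)}
pathways = {
    "pathway_A": {"g0", "g1", "g2", "g3", "g4"},
    "pathway_B": {"g3", "g4", "g5", "g6", "g7"},
}
de_genes = {"g3", "g4"}  # both DE genes fall in the shared region
pvalues = pathway_pvalues(universe, pathways, de_genes)
print(pvalues)
```

Because the two differentially expressed genes belong to both pathways, the two p-values are identical, so the tests are clearly not independent even though each is computed separately.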


Proceedings of the IEEE | 2017

A Novel Pathway Analysis Approach Based on the Unexplained Disregulation of Genes

Calin Voichita; Michele Donato; Rebecca Tagett; Sorin Draghici

A crucial step in the understanding of any phenotype is the correct identification of the signaling pathways that are significantly impacted in that phenotype. However, most current pathway analysis methods produce both false positives and false negatives in certain circumstances. We hypothesized that such incorrect results are due to the fact that the existing methods fail to distinguish between the primary dis-regulation of a given gene itself and the effects of signaling coming from upstream. Furthermore, a modern whole-genome experiment performed with a next-generation technology spends a great deal of effort to measure the entire set of 30000–100000 transcripts in the genome. This is followed by the selection of a few hundred differentially expressed genes, a step that literally discards more than 99% of the collected data. We also hypothesized that such drastic filtering could discard many genes that play crucial roles in the phenotype. We propose a novel topology-based pathway analysis method that identifies significantly impacted pathways using the entire set of measurements, thus allowing the full use of the data provided by NGS techniques. The results obtained on 24 real data sets involving 12 different human diseases, as well as on 8 yeast knock-out data sets, show that the proposed method yields significant improvements with respect to the state-of-the-art methods: SPIA, GSEA, and GSA. Availability: Primary dis-regulation analysis is implemented in R and included in ROntoTools Bioconductor package (versions

Collaboration


Top co-authors of Michele Donato.

Roberto Romero
National Institutes of Health

Adib Shafi
Wayne State University

Diana Diaz
Wayne State University