Francesco Sambo
University of Padua
Publications
Featured research published by Francesco Sambo.
Journal of The American Society of Nephrology | 2017
Niina Sandholm; Natalie Van Zuydam; Emma Ahlqvist; Thorhildur Juliusdottir; Harshal Deshmukh; N. William Rayner; Barbara Di Camillo; Carol Forsblom; João Fadista; Daniel Ziemek; Rany M. Salem; Linda T. Hiraki; Marcus G. Pezzolesi; David Tregouet; Emma Dahlström; Erkka Valo; Nikolay Oskolkov; Claes Ladenvall; M. Loredana Marcovecchio; Jason D. Cooper; Francesco Sambo; Alberto Malovini; Marco Manfrini; Amy Jayne McKnight; Maria Lajer; Valma Harjutsalo; Daniel Gordin; Maija Parkkonen; Valeriya Lyssenko; Paul McKeigue
Diabetes is the leading cause of ESRD. Despite evidence for a substantial heritability of diabetic kidney disease, efforts to identify genetic susceptibility variants have had limited success. We extended previous efforts in three dimensions, examining a more comprehensive set of genetic variants in larger numbers of subjects with type 1 diabetes characterized for a wider range of cross-sectional diabetic kidney disease phenotypes. In 2843 subjects, we estimated that the heritability of diabetic kidney disease was 35% (P=6.4×10⁻³). Genome-wide association analysis and replication in 12,540 individuals identified no single variants reaching stringent levels of significance and, despite excellent power, provided little independent confirmation of previously published associated variants. Whole-exome sequencing in 997 subjects failed to identify any large-effect coding alleles of lower frequency influencing the risk of diabetic kidney disease. However, sets of alleles increasing body mass index (P=2.2×10⁻⁵) and the risk of type 2 diabetes (P=6.1×10⁻⁴) associated with the risk of diabetic kidney disease. We also found genome-wide genetic correlation between diabetic kidney disease and failure at smoking cessation (P=1.1×10⁻⁴). Pathway analysis implicated ascorbate and aldarate metabolism (P=9.0×10⁻⁶), and pentose and glucuronate interconversions (P=3.0×10⁻⁶) in pathogenesis of diabetic kidney disease. These data provide further evidence for the role of genetic factors influencing diabetic kidney disease in those with type 1 diabetes and highlight some key pathways that may be responsible. Altogether, these results reveal important biology behind the major cause of kidney disease.
PLOS ONE | 2012
Barbara Di Camillo; Tiziana Sanavia; Matteo Martini; Giuseppe Jurman; Francesco Sambo; Annalisa Barla; Cesare Furlanello; Gianna Toffolo; Claudio Cobelli
Motivation: The identification of robust lists of molecular biomarkers related to a disease is a fundamental step for early diagnosis and treatment. However, methodologies for the discovery of biomarkers using microarray data often provide results with limited overlap. These differences are attributable to 1) dataset size (few subjects with respect to the number of features); 2) heterogeneity of the disease; 3) heterogeneity of experimental protocols and computational pipelines employed in the analysis. In this paper, we focus on the first two issues and assess, both on simulated (through an in silico regulation network model) and real clinical datasets, the consistency of candidate biomarkers provided by a number of different methods. Methods: We extensively simulated the effect of heterogeneity characteristic of complex diseases on different sets of microarray data. Heterogeneity was reproduced by simulating both intrinsic variability of the population and the alteration of regulatory mechanisms. Population variability was simulated by modeling evolution of a pool of subjects; then, a subset of them underwent alterations in regulatory mechanisms so as to mimic the disease state. Results: The simulated data allowed us to outline advantages and drawbacks of different methods across multiple studies and varying number of samples and to evaluate precision of feature selection on a benchmark with known biomarkers. Although comparable classification accuracy was reached by different methods, the use of external cross-validation loops is helpful in finding features with a higher degree of precision and stability. Application to real data confirmed these results.
BMC Bioinformatics | 2012
Francesco Sambo; Emanuele Trifoglio; Barbara Di Camillo; Gianna Toffolo; Claudio Cobelli
Background: Multifactorial diseases arise from complex patterns of interaction between a set of genetic traits and the environment. To fully capture the genetic biomarkers that jointly explain the heritability component of a disease, all SNPs from a genome-wide association study should therefore be analyzed simultaneously. Results: In this paper, we present Bag of Naïve Bayes (BoNB), an algorithm for genetic biomarker selection and subject classification from the simultaneous analysis of genome-wide SNP data. BoNB is based on the Naïve Bayes classification framework, enriched by three main features: bootstrap aggregating of an ensemble of Naïve Bayes classifiers, a novel strategy for ranking and selecting the attributes used by each classifier in the ensemble, and a permutation-based procedure for selecting significant biomarkers, based on their marginal utility in the classification process. BoNB is tested on the Wellcome Trust Case-Control study on Type 1 Diabetes and its performance is compared with that of both a standard Naïve Bayes algorithm and HyperLASSO, a state-of-the-art penalized logistic regression algorithm for simultaneous genome-wide data analysis. Conclusions: The significantly higher classification accuracy obtained by BoNB, together with the significance of the biomarkers identified from the Type 1 Diabetes dataset, demonstrates the effectiveness of BoNB as an algorithm for both classification and biomarker selection from genome-wide SNP data. Availability: Source code of the BoNB algorithm is released under the GNU General Public License and is available at http://www.dei.unipd.it/~sambofra/bonb.html.
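The bootstrap-aggregating idea named in this abstract can be illustrated with a minimal sketch. This is not the released BoNB code: it only bags categorical Naïve Bayes classifiers over genotypes coded 0/1/2, uses a stratified bootstrap (an assumption made here so that every resample contains both classes), and omits the attribute ranking and the permutation-based biomarker selection described above. All function names and the toy data are hypothetical.

```python
import math
import random
from collections import Counter, defaultdict


def train_nb(rows, labels, alpha=1.0):
    """Fit a categorical Naive Bayes model: class counts plus per-SNP genotype counts."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (snp_index, class) -> genotype -> count
    for row, y in zip(rows, labels):
        for f, genotype in enumerate(row):
            value_counts[(f, y)][genotype] += 1
    return {"classes": class_counts, "values": value_counts,
            "alpha": alpha, "n": len(labels)}


def predict_nb(model, row, n_values=3):
    """Return the class with the highest log-posterior (Laplace-smoothed likelihoods)."""
    best, best_score = None, -math.inf
    for c, count in model["classes"].items():
        score = math.log(count / model["n"])
        for f, genotype in enumerate(row):
            num = model["values"][(f, c)][genotype] + model["alpha"]
            den = count + model["alpha"] * n_values
            score += math.log(num / den)
        if score > best_score:
            best, best_score = c, score
    return best


def stratified_bootstrap(labels, rng):
    """Resample indices with replacement within each class (illustrative assumption)."""
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    idx = []
    for members in by_class.values():
        idx.extend(rng.choice(members) for _ in members)
    return idx


def train_bagged_nb(rows, labels, n_models=15, seed=0):
    """Train one Naive Bayes classifier per bootstrap resample."""
    rng = random.Random(seed)
    ensemble = []
    for _ in range(n_models):
        idx = stratified_bootstrap(labels, rng)
        ensemble.append(train_nb([rows[i] for i in idx], [labels[i] for i in idx]))
    return ensemble


def predict_bagged(ensemble, row):
    """Aggregate the ensemble by majority vote."""
    votes = Counter(predict_nb(m, row) for m in ensemble)
    return votes.most_common(1)[0][0]


# Toy data (made up): SNPs 0 and 1 separate the classes, SNP 2 is noise.
ROWS = [[2, 2, 0], [2, 2, 1], [2, 2, 0], [2, 2, 1],
        [0, 0, 0], [0, 0, 1], [0, 0, 0], [0, 0, 1]]
LABELS = ["case"] * 4 + ["control"] * 4
```

Calling `train_bagged_nb(ROWS, LABELS)` and then `predict_bagged(ensemble, [2, 2, 0])` classifies a new genotype vector by majority vote across the resampled models.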
Computational Statistics & Data Analysis | 2014
Francesco Sambo; Matteo Borrotti; Kalliopi Mylona
Many industrial experiments involve one or more restrictions on the randomization. In such cases, the split-plot design structure, in which the experimental runs are performed in groups, is a commonly used cost-efficient approach that reduces the number of independent settings of the hard-to-change factors. Several criteria can be adopted for optimizing split-plot experimental designs: the most frequently used are D-optimality and I-optimality. A multi-objective approach to the optimal design of split-plot experiments, the coordinate-exchange two-phase local search (CE-TPLS), is proposed. The CE-TPLS algorithm is able to approximate the set of experimental designs which concurrently minimize the D-criterion and the I-criterion. It allows for a flexible choice of the number of hard-to-change factors, the number of easy-to-change factors, the number of whole plots and the total sample size. When tested on four case studies from the literature, the proposed algorithm returns meaningful sets of experimental designs, covering the whole spectrum between the two objectives. On most of the analyzed cases, the CE-TPLS algorithm returns better results than those reported in the original papers and outperforms the state-of-the-art algorithm in terms of computational time, while retaining a comparable performance in terms of the quality of the optima for each single objective.
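For readers unfamiliar with the two criteria, the following is a sketch of their usual textbook form for a split-plot design with n runs grouped into b whole plots. The notation is an assumption introduced here, not taken from the paper. Since the abstract states that both criteria are minimized, the D-objective is presumably recast as a minimization (for example through an inverse or negative transform of the determinant); the exact form is not given in this text.

```latex
\begin{aligned}
\mathbf{y} &= \mathbf{X}\boldsymbol{\beta} + \mathbf{Z}\boldsymbol{\gamma} + \boldsymbol{\varepsilon},
\qquad \boldsymbol{\gamma} \sim N(\mathbf{0}, \sigma_\gamma^2 \mathbf{I}_b),
\quad \boldsymbol{\varepsilon} \sim N(\mathbf{0}, \sigma_\varepsilon^2 \mathbf{I}_n) \\
\mathbf{V} &= \sigma_\varepsilon^2 \left( \mathbf{I}_n + \eta\, \mathbf{Z}\mathbf{Z}^\top \right),
\qquad \eta = \sigma_\gamma^2 / \sigma_\varepsilon^2 \\
\mathbf{M} &= \mathbf{X}^\top \mathbf{V}^{-1} \mathbf{X} \\
\text{D-optimality:} &\quad \max \; \det(\mathbf{M}) \\
\text{I-optimality:} &\quad \min \; \frac{\int_{\chi} \mathbf{f}(\mathbf{x})^\top \mathbf{M}^{-1} \mathbf{f}(\mathbf{x}) \, d\mathbf{x}}{\int_{\chi} d\mathbf{x}}
\end{aligned}
```

Here X is the model matrix, Z assigns runs to whole plots, f(x) is the model expansion of a design point x, and χ is the experimental region. D-optimality targets precise parameter estimates, while I-optimality targets a low average prediction variance, which is why the two objectives can conflict.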
Journal of Diabetes Science and Technology | 2016
Chiara Fabris; Andrea Facchinetti; Giuseppe Fico; Francesco Sambo; María Teresa Arredondo; Claudio Cobelli
Background: Abnormal glucose variability (GV) is a risk factor for diabetes complications, and dozens of indices for its quantification from continuous glucose monitoring (CGM) time series have been proposed. However, the information carried by these indices is redundant, and a parsimonious description of GV can be obtained through sparse principal component analysis (SPCA). We have recently shown that a set of 10 metrics selected by SPCA is able to describe more than 60% of the variance of 25 GV indicators in type 1 diabetes (T1D). Here, we extend the application of SPCA to type 2 diabetes (T2D). Methods: A data set of CGM time series collected in 13 T2D subjects was considered. The 25 GV indices considered for T1D were evaluated. SPCA was used to select a subset of indices able to describe the majority of the original variance. Results: A subset of 10 indicators was selected, describing 83% of the variance of the original pool of 25 indices. Four metrics sufficient to describe 67% of the original variance turned out to be shared by the parsimonious sets of indices in T1D and T2D. Conclusions: Starting from a pool of 25 indices assessed from CGM time series in T2D subjects, reduced subsets of metrics virtually providing the same information content can be determined by SPCA. The fact that these indices also appear in the parsimonious description of GV in T1D may indicate that they could be particularly informative of GV in diabetes, regardless of the specific type of disease.
IEEE/ACM Transactions on Computational Biology and Bioinformatics | 2012
Francesco Sambo; Marco Antonio Montes de Oca; Barbara Di Camillo; Gianna Toffolo; Thomas Stützle
Reverse engineering is the problem of inferring the structure of a network of interactions between biological variables from a set of observations. In this paper, we propose an optimization algorithm, called MORE, for the reverse engineering of biological networks from time series data. The model inferred by MORE is a sparse system of nonlinear differential equations, complex enough to realistically describe the dynamics of a biological system. MORE tackles separately the discrete component of the problem, the determination of the biological network topology, and the continuous component of the problem, the strength of the interactions. This approach allows us both to enforce system sparsity, by globally constraining the number of edges, and to integrate a priori information about the structure of the underlying interaction network. Experimental results on simulated and real-world networks show that the mixed discrete/continuous optimization approach of MORE significantly outperforms standard continuous optimization and that MORE is competitive with the state of the art in terms of accuracy of the inferred networks.
Journal of Biomedical Informatics | 2015
Simone Marini; Emanuele Trifoglio; Nicola Barbarini; Francesco Sambo; Barbara Di Camillo; Alberto Malovini; Marco Manfrini; Claudio Cobelli; Riccardo Bellazzi
The increasing prevalence of diabetes and its related complications is raising the need for effective methods to predict patient evolution and for stratifying cohorts in terms of risk of developing diabetes-related complications. In this paper, we present a novel approach to the simulation of a type 1 diabetes population, based on Dynamic Bayesian Networks, which combines literature knowledge with data mining of a rich longitudinal cohort of type 1 diabetes patients, the DCCT/EDIC study. In particular, in our approach we simulate the patient health state and complications through discretized variables. Two types of models are presented, one entirely learned from the data and the other partially driven by literature derived knowledge. The whole cohort is simulated for fifteen years, and the simulation error (i.e. for each variable, the percentage of patients predicted in the wrong state) is calculated every year on independent test data. For each variable, the population predicted in the wrong state is below 10% on both models over time. Furthermore, the distributions of real vs. simulated patients greatly overlap. Thus, the proposed models are viable tools to support decision making in type 1 diabetes.
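The simulation loop and the error measure described in this abstract can be sketched as follows. This is only an illustration in the spirit of a two-slice Dynamic Bayesian Network, not the published models: the states, the single fixed covariate, and every transition probability below are hypothetical and were not learned from DCCT/EDIC data.

```python
import random

STATES = ("none", "mild", "severe")

# Hypothetical P(state at year t+1 | state at year t, hba1c_high).
# Each tuple gives the probabilities of ("none", "mild", "severe").
TRANSITIONS = {
    ("none", False): (0.95, 0.05, 0.00),
    ("none", True): (0.85, 0.14, 0.01),
    ("mild", False): (0.00, 0.92, 0.08),
    ("mild", True): (0.00, 0.80, 0.20),
    ("severe", False): (0.00, 0.00, 1.00),
    ("severe", True): (0.00, 0.00, 1.00),
}


def step(state, hba1c_high, rng):
    """Sample next year's discretized state from the conditional distribution."""
    weights = TRANSITIONS[(state, hba1c_high)]
    return rng.choices(STATES, weights=weights, k=1)[0]


def simulate(initial_state, hba1c_high, years, rng):
    """Return the trajectory of one simulated patient, including the initial state."""
    trajectory = [initial_state]
    for _ in range(years):
        trajectory.append(step(trajectory[-1], hba1c_high, rng))
    return trajectory


def simulation_error(predicted, observed):
    """Percentage of patients predicted in the wrong state for one variable and year."""
    wrong = sum(p != o for p, o in zip(predicted, observed))
    return 100.0 * wrong / len(observed)
```

Simulating each patient with `simulate(state, hba1c_high, 15, random.Random(seed))` and applying `simulation_error` to the predicted and observed states of a given year mirrors the yearly evaluation described above. In a fuller model the covariate would itself evolve over time rather than being held fixed.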
Graph Structures for Knowledge Representation and Reasoning | 2015
Magdalena Ivanovska; Audun Jøsang; Lance M. Kaplan; Francesco Sambo
Subjective logic is a formalism for reasoning under uncertain probabilistic information, with an explicit treatment of the uncertainty about the probability distributions. We introduce subjective networks as graph-based structures that generalize Bayesian networks to the theory of subjective logic. We discuss the perspectives of the subjective networks representation and the challenges of reasoning with them.
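As background for the formalism, here is a minimal sketch of a binomial opinion as commonly defined in subjective logic: belief, disbelief, and uncertainty sum to one, and the projected probability adds the base rate's share of the uncertainty to the belief. The class and function names are hypothetical, and the non-informative prior weight of 2 is the conventional default rather than something stated in this abstract. Subjective networks themselves are not modeled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BinomialOpinion:
    belief: float
    disbelief: float
    uncertainty: float
    base_rate: float = 0.5

    def projected_probability(self):
        """Belief plus the base rate's share of the uncertainty mass."""
        return self.belief + self.base_rate * self.uncertainty


def opinion_from_evidence(positive, negative, base_rate=0.5, prior_weight=2.0):
    """Map counts of supporting and opposing observations to a binomial opinion."""
    total = positive + negative + prior_weight
    return BinomialOpinion(
        belief=positive / total,
        disbelief=negative / total,
        uncertainty=prior_weight / total,
        base_rate=base_rate,
    )
```

With no evidence at all the opinion is vacuous (uncertainty equal to one) and the projected probability falls back to the base rate. As evidence accumulates, the uncertainty shrinks and the opinion approaches an ordinary probability.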
Modelling Methodology for Physiology and Medicine (Second Edition) | 2014
Francesco Sambo; Fulvia Ferrazzi; Riccardo Bellazzi
This chapter introduces a probabilistic approach to modelling in physiology and medicine: the quantities of interest are modeled as random variables and the focus is on the probabilistic dependencies between these variables. As the primary tool in this modelling framework, we present Bayesian networks (BNs), which map the dependencies between a set of random variables to a directed acyclic graph, both increasing human readability and simplifying the representation of the joint probability distribution of the set of variables. The chapter first describes the theoretical foundations of BNs, including a brief review of probability and graph theory, a formal definition of BNs and details on discrete, continuous, and dynamic BNs. Then, a selection of algorithms for inference, conditional probability learning, and structure learning is presented. Finally, several examples of BN applications in biomedicine are reviewed.
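How a directed acyclic graph simplifies the joint distribution can be shown with a small sketch. The three-node network below (hypertension H and poor glycaemic control G as parents of nephropathy N) and all of its probabilities are made up for illustration and do not come from the chapter. The joint distribution factorizes as P(H) P(G) P(N | H, G), and inference is done by brute-force enumeration.

```python
from itertools import product

# Hypothetical network: H -> N <- G, all variables binary (1 = present).
P_H = {1: 0.25, 0: 0.75}
P_G = {1: 0.5, 0: 0.5}
# P(N = 1 | H, G), indexed by (h, g); values are illustrative only.
P_N1_GIVEN = {(0, 0): 0.125, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.5}


def joint(h, g, n):
    """Joint probability from the network factorization P(H) P(G) P(N | H, G)."""
    p_n1 = P_N1_GIVEN[(h, g)]
    return P_H[h] * P_G[g] * (p_n1 if n == 1 else 1.0 - p_n1)


def marginal_n1():
    """P(N = 1), obtained by summing the joint over both parents."""
    return sum(joint(h, g, 1) for h, g in product((0, 1), repeat=2))


def posterior_h1_given_n1():
    """P(H = 1 | N = 1) via Bayes' rule on the enumerated joint."""
    return sum(joint(1, g, 1) for g in (0, 1)) / marginal_n1()
```

The factorized form needs 1 + 1 + 4 = 6 parameters instead of the 7 required by an unrestricted joint table over three binary variables, and the saving grows quickly as networks become larger and sparser. Enumeration is only practical for tiny networks, which is why dedicated inference algorithms are needed.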
Diabetologia | 2014
Francesco Sambo; Alberto Malovini; Niina Sandholm; Monica Stavarachi; Carol Forsblom; Ville Petteri Mäkinen; Valma Harjutsalo; Raija Lithovius; Daniel Gordin; Maija Parkkonen; Markku Saraheimo; Lena M. Thorn; Nina Tolonen; Johan Wadén; Bing He; Anne May Österholm; J. Tuomilehto; Maria Lajer; Rany M. Salem; Amy Jayne McKnight; Lise Tarnow; Nicolae Mircea Panduru; Nicola Barbarini; Barbara Di Camillo; Gianna Toffolo; Karl Tryggvason; Riccardo Bellazzi; Claudio Cobelli; Per-Henrik Groop
Aims/hypothesis: Diabetic nephropathy is a major diabetic complication, and diabetes is the leading cause of end-stage renal disease (ESRD). Family studies suggest a hereditary component for diabetic nephropathy. However, only a few genes have been associated with diabetic nephropathy or ESRD in diabetic patients. Our aim was to detect novel genetic variants associated with diabetic nephropathy and ESRD. Methods: We exploited a novel algorithm, ‘Bag of Naive Bayes’, whose marker selection strategy is complementary to that of conventional genome-wide association models based on univariate association tests. The analysis was performed on a genome-wide association study of 3,464 patients with type 1 diabetes from the Finnish Diabetic Nephropathy (FinnDiane) Study and subsequently replicated with 4,263 type 1 diabetes patients from the Steno Diabetes Centre, the All Ireland-Warren 3-Genetics of Kidneys in Diabetes UK collection (UK–Republic of Ireland) and the Genetics of Kidneys in Diabetes US Study (GoKinD US). Results: Five genetic loci (WNT4/ZBTB40-rs12137135, RGMA/MCTP2-rs17709344, MAPRE1P2-rs1670754, SEMA6D/SLC24A5-rs12917114 and SIK1-rs2838302) were associated with ESRD in the FinnDiane study. An association between ESRD and rs17709344, tagging the previously identified rs12437854 and located between the RGMA and MCTP2 genes, was replicated in independent case–control cohorts. rs12917114 near SEMA6D was associated with ESRD in the replication cohorts under the genotypic model (p < 0.05), and rs12137135 upstream of WNT4 was associated with ESRD in Steno. Conclusions/interpretation: This study supports the previously identified findings on the RGMA/MCTP2 region and suggests novel susceptibility loci for ESRD. This highlights the importance of applying complementary statistical methods to detect novel genetic variants in diabetic nephropathy and, in general, in complex diseases.