Roberta Siciliano | Researchain


Publication

Featured research published by Roberta Siciliano.

Journal of Classification | 2009

Incremental Tree-Based Missing Data Imputation with Lexicographic Ordering

Claudio Conversano; Roberta Siciliano

In the framework of incomplete data analysis, this paper provides a nonparametric approach to missing data imputation based on Information Retrieval. In particular, an incremental procedure based on the iterative use of a tree-based method is proposed, and a suitable Incremental Imputation Algorithm is introduced. The key idea is to define a lexicographic ordering of cases and variables so that conditional mean imputation via binary trees can be performed incrementally. A simulation study and real data applications are carried out to describe the advantages and the performance of the approach with respect to standard approaches.

Intelligent Data Analysis | 2007

Robust tree-based incremental imputation method for data fusion

Antonio D'Ambrosio; Massimo Aria; Roberta Siciliano

Data Fusion and Data Grafting are concerned with combining files and information coming from different sources. The problem is not to extract data from a single database, but to merge information collected from different sample surveys. The typical data fusion situation consists of two data samples: the former is a complete data matrix X from a first survey, and the latter is a matrix Y in which a certain number of variables are missing. The aim is to complete the matrix Y starting from the knowledge acquired from X. Thus, the goal is to define the correlation structure that joins the two data matrices to be merged. In this paper, we provide an innovative methodology for Data Fusion based on an incremental imputation algorithm in tree-based models. In addition, we consider robust tree validation by boosting iterations. A relevant advantage of the proposed method is that it works for a mixed data structure including both numerical and categorical variables. As benchmarking methods we consider explicit methods, such as standard trees and multiple regression, as well as an implicit method based on principal component analysis. An extensive simulation study shows that the proposed method is more accurate than the other methods.

European Journal of Operational Research | 2016

Accurate algorithms for identifying the median ranking when dealing with weak and partial rankings under the Kemeny axiomatic approach

Sonia Amodio; Antonio D’Ambrosio; Roberta Siciliano

Preference rankings appear in virtually all fields of science (political sciences, behavioral sciences, machine learning, decision making and so on). The well-known social choice problem consists in trying to find a reasonable procedure that uses the aggregate preferences or rankings expressed by subjects to reach a collective decision. This turns out to be equivalent to estimating the consensus (central) ranking from data, and it is known to be an NP-hard problem. A useful solution was proposed by Emond and Mason in 2002 through the Branch-and-Bound algorithm (BB) within the Kemeny and Snell axiomatic framework. However, BB is a time-demanding procedure when the complexity of the problem becomes intractable, i.e. with a large number of objects, weak and partial rankings, and a low degree of consensus. As an alternative, we propose an accurate heuristic algorithm called FAST that finds at least one of the consensus ranking solutions found by BB while saving considerable computational time. In addition, we show that the building block of FAST is an algorithm called QUICK that already finds one of the BB solutions, so that it can be fruitfully used to speed up the overall search even further if the number of objects is low. Simulation studies and applications on real data show the accuracy and the computational efficiency of our proposal.
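To make the consensus ranking problem in the abstract above concrete, here is a minimal sketch of a Kemeny distance and an exhaustive search for the median ranking. It is not the BB, FAST or QUICK algorithm from the paper. It assumes the classical Kemeny-Snell score matrix (1 if object i is preferred to j, -1 if j is preferred to i, 0 if tied), encodes a ranking as a tuple of rank positions, and restricts candidate solutions to full rankings without ties. The function names are hypothetical, and the search is only feasible for a handful of objects.

```python
from itertools import permutations


def score_matrix(ranking):
    """Pairwise score matrix for a ranking given as rank positions.

    ranking[i] is the position of object i (lower = more preferred,
    equal positions = tie). Entry [i][j] is 1 if i is preferred to j,
    -1 if j is preferred to i, and 0 if they are tied (assumed
    Kemeny-Snell convention).
    """
    n = len(ranking)
    return [
        [(ranking[i] < ranking[j]) - (ranking[i] > ranking[j]) for j in range(n)]
        for i in range(n)
    ]


def kemeny_distance(r1, r2):
    """Half the sum of absolute differences between the score matrices."""
    a, b = score_matrix(r1), score_matrix(r2)
    n = len(r1)
    return sum(abs(a[i][j] - b[i][j]) for i in range(n) for j in range(n)) / 2


def brute_force_median(rankings):
    """Return all full rankings minimising the total Kemeny distance.

    Enumerates every permutation, so the cost grows factorially with
    the number of objects; candidates with ties are not considered.
    """
    n = len(rankings[0])
    best, best_cost = [], None
    for candidate in permutations(range(1, n + 1)):
        cost = sum(kemeny_distance(candidate, r) for r in rankings)
        if best_cost is None or cost < best_cost:
            best, best_cost = [candidate], cost
        elif cost == best_cost:
            best.append(candidate)
    return best, best_cost


if __name__ == "__main__":
    judges = [(1, 2, 3), (1, 2, 3), (2, 1, 3)]
    print(brute_force_median(judges))
```

The factorial growth of this enumeration is the reason exact methods become impractical as the number of objects increases, which is the setting that motivates heuristics of the kind described in the abstract.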

Computational Statistics & Data Analysis | 2002

Generalized additive multi-mixture model for data mining

Claudio Conversano; Roberta Siciliano; Francesco Mola

The main idea of this paper is to make statistical modelling a feasible and valuable approach to data mining. The class of generalized additive multi-models (GAM-M) is considered in the framework of non-linear regression methods and data mining. GAM-M are based on a combined model integration approach that aims to associate estimations derived from smoothing functions as well as from either parametric or non-parametric models. We extend this approach to provide a class of models based on a mixture model combination. Bootstrap averaging and model fit scoring are exploited in order to prevent overfitting as well as to improve the prediction accuracy of the GAM-M models. The benchmarking of the proposed methodology is shown using a simulated data set.

Expert Systems With Applications | 2016

Parsimonious time series clustering using P-splines

Carmela Iorio; Gianluca Frasso; Antonio D'Ambrosio; Roberta Siciliano

A new parsimonious way to cluster time (data) series is provided. We deal with the P-spline framework and non-hierarchical clustering. Simulation studies and two well-known real-world case studies are performed. We introduce a parsimonious model-based framework for clustering time course data. In these applications the computational burden often becomes an issue due to the large number of available observations. The measured time series can also be very noisy and sparse, and an appropriate model describing them can be hard to define. We propose to model the observed measurements by using P-spline smoothers and then to cluster the functional objects as summarized by the optimal spline coefficients. Depending on the characteristics of the observed measurements, our proposal can be combined with any suitable clustering method. In this paper we provide applications based on non-hierarchical clustering algorithms. We evaluate the accuracy and the efficiency of our proposal by simulations and by analyzing two real data examples.

Proceedings In Computational Statistics | 1998

An Alternative Pruning Method Based on the Impurity-Complexity Measure

Carmela Cappelli; Francesco Mola; Roberta Siciliano

This paper provides a new pruning method for classification trees based on the impurity-complexity measure. Advantages of the proposed approach compared to the error-complexity pruning method are outlined showing an example on a real data set.

Oncotarget | 2016

Somatostatin Analogues according to Ki67 index in neuroendocrine tumours: an observational retrospective-prospective analysis from real life

Antongiulio Faggiano; Anna Chiara Carratu; Elia Guadagno; Salvatore Tafuto; Fabiana Tatangelo; Ferdinando Riccardi; Carmela Mocerino; Giovannella Palmieri; Vincenzo Damiano; Roberta Siciliano; Silvana Leo; Annamaria Mauro; Lucia Tozzi; Claudia Battista; Gaetano De Rosa; Annamaria Colao

Somatostatin analogues (SSAs) have shown limited and variable antiproliferative effects in neuroendocrine tumours (NETs). Whether tumour control by SSAs depends on grading based on the 2010 WHO NET classification is still unclear. The aim of this study is to evaluate the efficacy of long-acting SSAs in NETs according to Ki67 index. An observational Italian multicentre study was designed to collect data in patients with gastro-entero-pancreatic or thoracic NETs under SSA treatment. Both retrospective and prospective data were included and analysed according to Ki67 index, immunohistochemically evaluated in tumour samples and graded according to WHO classification (G1 = Ki67 index 0-2%, G2 = Ki67 index 3-20%, G3 = Ki67 index > 20%). Among 601 patients with NET, 140 with a histologically confirmed gastro-entero-pancreatic or thoracic NET or NET with unknown primary were treated with lanreotide autogel or octreotide LAR. An objective tumour response was observed in 11%, stability in 58% and progression in 31%. Objective response and tumour stability were not significantly different between G1 and G2 NETs. Progression-free survival (PFS) was longer in G1 than in G2 NETs, but the difference was not significant (median: 89 vs 43 months, p = 0.15). The median PFS was significantly longer in NETs showing Ki67 < 5% than in those showing Ki67 ≥ 5% (89 vs 35 months, p = 0.005). SSA therapy shows significant antiproliferative effects in well-differentiated low/intermediate-proliferating NETs, not only in G1 but also in G2 type. A Ki67 index of 5% seems to work better than 3% to select the best candidates for SSA therapy.

Expert Systems With Applications | 2017

Regression trees for multivalued numerical response variables

Antonio D’Ambrosio; Massimo Aria; Carmela Iorio; Roberta Siciliano

In the framework of regression trees, this paper provides a recursive partitioning methodology to deal with a non-standard response variable. Specifically, either a multivalued numerical response or a modal response of the histogram type is considered. These data are known as symbolic data, whose special cases are classical data, imprecise data, conjunctive data as well as fuzzy data. Instead of pre-processing the data in order to apply standard regression tree methodology, this paper provides, as its main contribution, a definition of the impurity measure and of the splitting criterion that allows a regression tree to be built for a multivalued numerical response variable. We analyze and evaluate the performance of our proposal using simulated data as well as real-world case studies.

Computers & Operations Research | 2017

A differential evolution algorithm for finding the median ranking under the Kemeny axiomatic approach

Antonio D'Ambrosio; Giulio Mazzeo; Carmela Iorio; Roberta Siciliano

An accurate (meta)heuristic solution to the rank aggregation problem is proposed. The reference paradigm is the Kemeny-Snell axiomatic framework. We specifically adapt the differential evolution algorithm to deal with the median ranking problem. Simulation studies and real data applications are performed. In recent years the analysis of preference rankings has become an increasingly important topic. One of the most important tasks in dealing with preference rankings is the identification of the median ranking, namely the ranking that best represents the preferences of a population of judges. This task is known by several alternative names, such as the rank aggregation problem, the consensus ranking problem and the social choice problem. In this paper we propose a Differential Evolution algorithm for Consensus Ranking detection (DECoR) within Kemeny's axiomatic framework. The algorithm works with full, partial and incomplete rankings. A simulation study shows that our proposal is particularly suitable when working with a very large number of objects to be ranked, because it is accurate and also faster than other proposals. Some applications on real data sets show the practical utility of our proposal in helping users make decisions.

Expert Systems With Applications | 2018

A P-Spline based clustering approach for portfolio selection

Carmela Iorio; Gianluca Frasso; Antonio D’Ambrosio; Roberta Siciliano

In recent years, many clustering techniques dealing with time course data have been proposed, owing to the recent interest in studying phenomena that change over time. A new clustering method suitable for time series applications has recently been proposed by exploiting the properties of the P-splines approach. This semi-parametric tool has several advantages: it facilitates the removal of noise from time series and it saves computational time. In this paper, we propose to use this clustering approach on financial data with the aim of building a financial portfolio. Our proposal works directly on time series without any pre-processing, except for the computation of the spline coefficients and, if needed, the normalization of the series. We show that our strategy is useful to support the investment decisions of financial practitioners.
