Nicola Torelli | Researchain

Publication

Featured research published by Nicola Torelli.

Statistics and Computing | 2007

Clustering via nonparametric density estimation

Adelchi Azzalini; Nicola Torelli

Although Hartigan (1975) had already put forward the idea of connecting identification of subpopulations with regions with high density of the underlying probability distribution, the actual development of methods for cluster analysis has largely shifted towards other directions, for computational convenience. Current computational resources allow us to reconsider this formulation and to develop clustering techniques directly in order to identify local modes of the density. Given a set of observations, a nonparametric estimate of the underlying density function is constructed, and subsets of points with high density are formed through suitable manipulation of the associated Delaunay triangulation. The method is illustrated with some numerical examples.
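As a rough illustration of the idea in this abstract, the following Python sketch builds a Gaussian kernel density estimate and groups the observations whose estimated density exceeds a threshold. It is a one-dimensional simplification, not the authors' procedure: in one dimension, neighbours in sorted order stand in for the edges of the Delaunay triangulation, and the bandwidth `h`, the threshold `level`, and the function names are assumptions made for the example.

```python
import math


def gaussian_kde(data, h):
    """Return a function that evaluates a Gaussian kernel density estimate."""
    n = len(data)
    norm = 1.0 / (n * h * math.sqrt(2 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data)

    return density


def high_density_groups(data, h, level):
    """Group sorted points whose estimated density exceeds `level`.

    Assumption: in 1D, consecutive sorted points play the role of
    Delaunay neighbours, so a low-density point splits two groups.
    """
    density = gaussian_kde(data, h)
    groups, current = [], []
    for x in sorted(data):
        if density(x) > level:
            current.append(x)
        elif current:
            groups.append(current)
            current = []
    if current:
        groups.append(current)
    return groups


# Two tight groups separated by one isolated observation.
data = [0.0, 0.1, 0.2, 0.3, 2.5, 5.0, 5.1, 5.2, 5.3]
groups = high_density_groups(data, h=0.3, level=0.2)
print(groups)
```

Lowering `level` eventually merges the groups, and raising it shrinks them, which is why the choice of threshold matters in any density-based formulation of this kind.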

Data Mining and Knowledge Discovery | 2014

Training and assessing classification rules with imbalanced data

Giovanna Menardi; Nicola Torelli

The problem of modeling binary responses by using cross-sectional data has been addressed with a number of satisfying solutions that draw on both parametric and nonparametric methods. However, there exist many real situations where one of the two responses (usually the most interesting for the analysis) is rare. It has been widely reported that this class imbalance heavily compromises the process of learning, because the model tends to focus on the prevalent class and to ignore the rare events. Moreover, a skewed class distribution affects not only the estimation of the classification model but also the evaluation of its accuracy, because the scarcity of data leads to poor estimates of the model’s accuracy. In this work, the effects of class imbalance on model training and model assessment are discussed. A unified and systematic framework for dealing with the problem of imbalanced classification is then proposed, based on a smoothed bootstrap re-sampling technique. The proposed technique is founded on a sound theoretical basis, and an extensive empirical study shows that it outperforms the main alternative remedies for imbalanced learning problems.
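A smoothed bootstrap can be sketched in a few lines of Python. The version below is only a minimal sketch under stated assumptions, not the authors' implementation: it picks a class with equal probability, picks one observation of that class, and perturbs it with Gaussian noise of standard deviation `h`. The function name, the single shared bandwidth `h`, and the equal class probabilities are all choices made for the example.

```python
import random


def smoothed_bootstrap(X, y, n, h, seed=0):
    """Draw `n` synthetic observations with roughly balanced classes.

    Assumptions: Gaussian kernel, one bandwidth `h` for every feature,
    and each class drawn with equal probability.
    """
    rng = random.Random(seed)
    by_class = {}
    for xi, yi in zip(X, y):
        by_class.setdefault(yi, []).append(xi)
    labels = sorted(by_class)

    new_X, new_y = [], []
    for _ in range(n):
        label = rng.choice(labels)            # balanced choice of class
        centre = rng.choice(by_class[label])  # ordinary bootstrap draw
        # Smoothing step: jitter the drawn point with kernel noise.
        new_X.append([c + rng.gauss(0.0, h) for c in centre])
        new_y.append(label)
    return new_X, new_y


# 18 majority observations and 2 rare ones.
X = [[float(i), 0.0] for i in range(18)] + [[50.0, 5.0], [51.0, 5.5]]
y = [0] * 18 + [1] * 2
new_X, new_y = smoothed_bootstrap(X, y, n=200, h=0.1)
print(new_y.count(0), new_y.count(1))
```

Because new points are drawn around existing ones rather than copied, the rare class is not represented by exact duplicates, which is the main difference from plain oversampling.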

Advanced Data Analysis and Classification | 2014

Clustering of financial time series in risky scenarios

Fabrizio Durante; Roberta Pappadà; Nicola Torelli

A methodology is presented for clustering financial time series according to the association in the tail of their distribution. The procedure is based on the calculation of suitable pairwise conditional Spearman’s correlation coefficients extracted from the series. The performance of the method has been tested via a simulation study. As an illustration, an analysis of the components of the Italian FTSE–MIB is presented. The results could be applied to construct financial portfolios that reduce the risk of simultaneous large losses in several markets.
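The abstract does not spell out how the conditional coefficient is computed. The Python sketch below assumes one plausible reading: Spearman's correlation restricted to the observations where both series fall below their empirical `alpha`-quantile (the joint lower tail). The function names, the `alpha` parameter, and the no-ties assumption are hypothetical, and the paper's exact definition may differ.

```python
def ranks(values):
    """Ranks from 1 to n; assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result


def pearson(a, b):
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def spearman(a, b):
    """Spearman's correlation as Pearson's correlation of the ranks."""
    return pearson(ranks(a), ranks(b))


def tail_spearman(x, y, alpha):
    """Spearman's correlation over the joint lower tail (assumed definition)."""
    n = len(x)
    u = [r / (n + 1) for r in ranks(x)]  # pseudo-observations in (0, 1)
    v = [r / (n + 1) for r in ranks(y)]
    tail = [(xi, yi) for xi, yi, ui, vi in zip(x, y, u, v)
            if ui <= alpha and vi <= alpha]
    if len(tail) < 3:
        raise ValueError("too few observations in the joint tail")
    xs, ys = zip(*tail)
    return spearman(xs, ys)


# Returns that move together in large losses but not otherwise.
x = [-5, -4, -3, -2, -1, 0, 1, 2, 3, 4]
y = [-50, -40, -30, -20, 3, -1, 2, 0, 4, 1]
overall = spearman(x, y)
lower_tail = tail_spearman(x, y, alpha=0.4)
print(overall, lower_tail)
```

A pairwise dissimilarity built from such coefficients could then be passed to any standard clustering routine, so that series with strong tail association end up in the same group.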

Archive | 2010

Preserving the Clustering Structure by a Projection Pursuit Approach

Giovanna Menardi; Nicola Torelli

A projection pursuit technique is proposed that reduces the dimensionality of a data set while preserving its clustering structure. It is based on Silverman’s (J R Stat Soc B 43:97–99, 1981) critical bandwidth. We show that the critical bandwidth is scale equivariant, and this property allows us to maintain the affine invariance of the projection pursuit solution.
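The critical bandwidth is the smallest bandwidth at which a Gaussian kernel density estimate is unimodal. The following Python sketch approximates it for one-dimensional data by counting modes on a grid and bisecting on the bandwidth; it relies on the known fact that the number of modes of a Gaussian kernel estimate does not increase as the bandwidth grows. The function names, the grid size, and the number of bisection steps are assumptions made for the example, and the result is only as accurate as the grid allows.

```python
import math


def count_modes(data, h, grid_size=400):
    """Count strict local maxima of a Gaussian KDE on a padded grid."""
    lo, hi = min(data), max(data)
    pad = 0.1 * (hi - lo)
    start = lo - pad
    step = (hi - lo + 2 * pad) / grid_size
    dens = []
    for k in range(grid_size + 1):
        x = start + k * step
        # The normalising constant is omitted: it does not move the modes.
        dens.append(sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data))
    return sum(1 for i in range(1, grid_size)
               if dens[i - 1] < dens[i] > dens[i + 1])


def critical_bandwidth(data, iterations=30):
    """Approximate the smallest bandwidth giving a unimodal estimate."""
    spread = max(data) - min(data)
    if spread == 0:
        raise ValueError("data must contain at least two distinct values")
    lo, hi = 0.0, spread
    while count_modes(data, hi) > 1:  # make sure `hi` is unimodal
        hi *= 2
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if count_modes(data, mid) > 1:
            lo = mid
        else:
            hi = mid
    return hi


# Two points at distance 2: theory gives a critical bandwidth of 1.
h_original = critical_bandwidth([0.0, 2.0])
# Rescaling the data by 10 should rescale the statistic by 10.
h_rescaled = critical_bandwidth([0.0, 20.0])
print(h_original, h_rescaled)
```

Comparing the two printed values is a quick numerical check of the scale equivariance mentioned in the abstract.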

Archive | 2005

Selecting the Training Set in Classification Problems with Rare Events

Bruno Scarpa; Nicola Torelli

Binary classification algorithms are often used in situations when one of the two classes is extremely rare. A common practice is to oversample units of the rare class when forming the training set. For some classification algorithms, like logistic classification, there are theoretical results that justify such an approach. Similar results are not available for other popular classification algorithms like classification trees. In this paper the use of balanced datasets, when dealing with rare classes, for tree classifiers and boosting algorithms is discussed and results from analyzing a real dataset and a simulated dataset are reported.
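The abstract does not say which theoretical results it has in mind. One well-known result of this kind is that, when sampling depends only on the class label, the slopes of a logistic model are unaffected and only the intercept shifts, by the log ratio of the sample odds to the population odds. The Python sketch below applies that correction; the function and parameter names are hypothetical, and the population rate of the rare class is assumed to be known.

```python
import math


def logistic(eta):
    return 1.0 / (1.0 + math.exp(-eta))


def corrected_intercept(b0, sample_rate, population_rate):
    """Move an intercept fitted on a rebalanced sample to the population scale.

    Assumption: units were selected into the training set based on the
    class label only, so slopes need no adjustment.
    """
    sample_odds = sample_rate / (1.0 - sample_rate)
    population_odds = population_rate / (1.0 - population_rate)
    return b0 - math.log(sample_odds / population_odds)


# Model fitted on a balanced sample (50% rare class) while the rare
# class makes up 1% of the population.
b0_balanced = 0.0
b0_population = corrected_intercept(b0_balanced, sample_rate=0.5,
                                    population_rate=0.01)
baseline_probability = logistic(b0_population)
print(b0_population, baseline_probability)
```

No comparable closed-form adjustment is available for classification trees, which is the gap the paper addresses empirically.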

Journal of Statistical Computation and Simulation | 2013

Reducing data dimension for cluster detection

Giovanna Menardi; Nicola Torelli

Clustering high-dimensional data is often a challenging task both because of the computational burden required to run any technique, and because the difficulty in interpreting clusters generally increases with the data dimension. In this work, a method for finding low-dimensional representations of high-dimensional data is discussed, specifically conceived to preserve possible clusters in data. It is based on the critical bandwidth, a nonparametric statistic to test unimodality, related to kernel density estimation. Some useful properties of this statistic are highlighted, and an adjustment to use it as a basis for reducing dimensionality is suggested. The method is illustrated by simulated and real data examples.

Archive | 2018

A Graphical Tool for Copula Selection Based on Tail Dependence

Roberta Pappadà; Fabrizio Durante; Nicola Torelli

In many practical applications, selecting copulas with a specific tail behaviour may make it possible to estimate properly the region of the distribution that matters most, especially in risk management procedures. Here, a graphical tool is presented to assist the decision maker in selecting an appropriate model for the problem at hand. The tool provides valuable indications for a preliminary overview of the tail features of different copulas, which may help in the choice of a parametric model. Its use is illustrated under various dependency scenarios.

Journal of The Royal Statistical Society Series A-statistics in Society | 2018

Bayesian semiparametric modelling of contraceptive behaviour in India via sequential logistic regressions

Tommaso Rigon; Daniele Durante; Nicola Torelli

Family planning has been characterized by highly different strategic programmes in India, including method‐specific contraceptive targets, coercive sterilization and more recent target‐free approaches. These major changes in family planning policies over time have motivated considerable interest towards assessing the effectiveness of the different planning programmes. Current studies mainly focus on the factors driving the choice among specific subsets of contraceptives, such as a preference for alternative methods other than sterilization. Although this restricted focus produces key insights, it fails to provide a global overview of the different policies, and of the determinants underlying the choices from the entire range of contraceptive methods. Motivated by this consideration, we propose a Bayesian semiparametric model relying on a reparameterization of the multinomial probability mass function via a set of conditional Bernoulli choices. This binary decision tree is defined to be consistent with the current family planning policies in India, and coherent with a reasonable process characterizing the choice between increasingly nested subsets of contraceptive methods. The model allows a subset of covariates to enter the predictor via Bayesian penalized splines and exploits mixture models to represent uncertainty in the distribution of the state‐specific random effects flexibly. This combination of flexible and careful reparameterizations allows a broader and interpretable overview of the policies and contraceptive preferences in India.
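The reparameterization of a multinomial distribution through conditional Bernoulli choices can be illustrated with a short Python sketch. It assumes a purely sequential tree, in which each step either selects the current category or moves on to the remaining ones, with each conditional probability given by a logistic function of a linear predictor. The function names and the example predictors are hypothetical, and the tree used in the paper may be structured differently.

```python
import math


def logistic(eta):
    return 1.0 / (1.0 + math.exp(-eta))


def sequential_probabilities(etas):
    """Turn K-1 linear predictors into K multinomial probabilities.

    Step k is a Bernoulli choice: pick category k with probability
    logistic(etas[k]), otherwise continue to the remaining categories.
    """
    probabilities = []
    remaining = 1.0  # probability of reaching the current step
    for eta in etas:
        q = logistic(eta)
        probabilities.append(remaining * q)
        remaining *= 1.0 - q
    probabilities.append(remaining)  # last category takes what is left
    return probabilities


# Three binary decisions define a distribution over four categories.
probabilities = sequential_probabilities([0.0, 0.0, 0.0])
print(probabilities)
```

Each binary step can be fitted as its own logistic regression on the units that reach it, which is what makes this kind of reparameterization convenient for adding splines or random effects to individual decisions.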

arXiv: Computation | 2016

Maxima Units Search (MUS) Algorithm: Methodology and Applications

Leonardo Egidi; Roberta Pappadà; Francesco Pauli; Nicola Torelli

An algorithm for extracting identity submatrices of small rank and pivotal units from large and sparse matrices is proposed. The procedure has already been satisfactorily applied for solving the label switching problem in Bayesian mixture models. Here we introduce it on its own and explore possible applications in different contexts.

Archive | 2011

On the Use of Boosting Procedures to Predict the Risk of Default

Giovanna Menardi; Federico Tedeschi; Nicola Torelli

Statistical models have been widely applied with the aim of evaluating the risk of default of enterprises. However, a typical problem is that the occurrence of the default event is rare, and this class imbalance strongly affects the performance of traditional classifiers. Boosting is a general class of methods that iteratively improves the accuracy of any weak learner, but it suffers from some drawbacks in the presence of unbalanced classes. The performance of standard boosting procedures in dealing with unbalanced classes is discussed and a new algorithm is proposed.
