Domenico Perrotta
European Commission
Publication
Featured research published by Domenico Perrotta.
Electronic Journal of Statistics | 2014
Marco Riani; Andrea Cerioli; Anthony C. Atkinson; Domenico Perrotta
Robust methods are little applied (although much studied by statisticians). We monitor very robust regression by looking at the behaviour of residuals and test statistics as we smoothly change the robustness of parameter estimation from a breakdown point of 50% to non-robust least squares. The resulting procedure provides insight into the structure of the data including outliers and the presence of more than one population. Monitoring overcomes the hindrances to the routine adoption of robust methods, being informative about the choice between the various robust procedures. Methods tuned to give nominal high efficiency fail with our most complicated example. We find that the most informative analyses come from S estimates combined with Tukey's biweight or with the optimal ρ functions. For our major example with 1,949 observations and 13 explanatory variables, we combine robust S estimation with regression using the forward search, so obtaining an understanding of the importance of individual observations, which is missing from standard robust procedures. We discover that the data come from two different populations. They also contain six outliers. Our analyses are accompanied by numerous graphs. Algebraic results are contained in two appendices, the second of which provides useful new results on the absolute odd moments of elliptically truncated multivariate normal random variables.
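The monitoring idea in this abstract can be illustrated with a much simpler stand-in. The sketch below does not use S estimates or the authors' software. It assumes a crude least-trimmed-squares line (random two-point starts followed by concentration steps) and varies the retained fraction of the data from 50% to 100%, where 100% coincides with ordinary least squares. The function names, the synthetic data and the tuning values are all hypothetical.

```python
import random


def ols(xs, ys):
    """Ordinary least squares for a straight line; returns (intercept, slope)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def trimmed_fit(xs, ys, coverage, n_starts=50, n_csteps=10, seed=0):
    """Crude trimmed least squares keeping a fraction `coverage` of the data.

    Assumption: this is a simplified stand-in, not an S estimator.
    """
    rng = random.Random(seed)
    n = len(xs)
    h = max(2, round(coverage * n))
    best = None  # (trimmed sum of squares, intercept, slope)
    for _ in range(n_starts):
        subset = rng.sample(range(n), 2)  # random two-point start
        for _ in range(n_csteps):
            # Concentration step: fit on the subset, keep the h best-fitting points.
            a, b = ols([xs[i] for i in subset], [ys[i] for i in subset])
            sq_res = [(ys[i] - a - b * xs[i]) ** 2 for i in range(n)]
            subset = sorted(range(n), key=sq_res.__getitem__)[:h]
        objective = sum(sq_res[i] for i in subset)
        if best is None or objective < best[0]:
            best = (objective, a, b)
    return best[1], best[2]


# Synthetic data: 70 points near y = 1 + 2x, then 30 outliers near y = 0.
rng = random.Random(3)
xs = [i / 10 for i in range(100)]
ys = [1.0 + 2.0 * x + rng.gauss(0.0, 0.1) for x in xs]
for i in range(70, 100):
    ys[i] = rng.gauss(0.0, 0.1)

# Monitor the fitted slope as the fit moves from very robust to least squares.
coverages = [0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
slopes = {c: trimmed_fit(xs, ys, c)[1] for c in coverages}
for c in coverages:
    print(f"coverage {c:.1f}: slope {slopes[c]:.2f}")
```

In this toy setting the slope stays close to 2 while the retained fraction excludes the contaminated points and moves away once they are forced into the fit. That change along the path is the kind of signal a monitoring plot is meant to expose.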
Statistical Science | 2014
Marco Riani; Anthony C. Atkinson; Domenico Perrotta
There are several methods for obtaining very robust estimates of regression parameters that asymptotically resist 50% of outliers in the data. Differences in the behaviour of these algorithms depend on the distance between the regression data and the outliers. We introduce a parameter $\lambda$ that defines a parametric path in the space of models and enables us to study, in a systematic way, the properties of estimators as the groups of data move from being far apart to close together. We examine, as a function of $\lambda$, the variance and squared bias of five estimators and we also consider their power when used in the detection of outliers. This systematic approach provides tools for gaining knowledge and better understanding of the properties of robust estimators.
Advanced Data Analysis and Classification | 2014
Andrea Cerioli; Domenico Perrotta
Computational Statistics & Data Analysis | 2012
Francesca Torti; Domenico Perrotta; Anthony C. Atkinson; Marco Riani
Advanced Data Analysis and Classification | 2009
Domenico Perrotta; Marco Riani; Francesca Torti
Archive | 2010
Domenico Perrotta; Francesca Torti
Soft Methods in Probability and Statistics | 2013
Marco Turchi; Domenico Perrotta; Marco Riani; Andrea Cerioli
Robust methods are needed to fit regression lines when outliers are present. In a clustering framework, outliers can be extreme observations, high leverage points, but also data points which lie among the groups. Outliers are also of paramount importance in the analysis of international trade data, which motivate our work, because they may provide information about anomalies like fraudulent transactions. In this paper we show that robust techniques can fail when a large proportion of non-contaminated observations fall in a small region, which is a likely occurrence in many international trade data sets. In such instances, the effect of a high-density region is so strong that it can override the benefits of trimming and other robust devices. We propose to solve the problem by sampling a much smaller subset of observations which preserves the cluster structure and retains the main outliers of the original data set. This goal is achieved by defining the retention probability of each point as an inverse function of the estimated density function for the whole data set. We motivate our proposal as a thinning operation on a point pattern generated by different components. We then apply robust clustering methods to the thinned data set for the purposes of classification and outlier detection. We show the advantages of our method both in empirical applications to international trade examples and through a simulation study.
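The thinning step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' procedure. It assumes a plain Gaussian kernel density estimate with a fixed bandwidth and a retention probability equal to the smallest estimated density divided by the density at the point. The abstract states only that the probability is an inverse function of the density, so this scaling, the function names and the synthetic data are hypothetical.

```python
import math
import random


def kernel_density(points, bandwidth):
    """Gaussian kernel density estimate at each 2-D point (O(n^2), fine for a demo)."""
    norm = 1.0 / (len(points) * 2.0 * math.pi * bandwidth ** 2)
    densities = []
    for x0, y0 in points:
        total = 0.0
        for x1, y1 in points:
            sq_dist = (x0 - x1) ** 2 + (y0 - y1) ** 2
            total += math.exp(-0.5 * sq_dist / bandwidth ** 2)
        densities.append(norm * total)
    return densities


def thin(points, bandwidth=0.5, seed=0):
    """Retain each point with probability inversely proportional to its density."""
    rng = random.Random(seed)
    densities = kernel_density(points, bandwidth)
    min_density = min(densities)
    # Assumed scaling: the point in the sparsest region is kept with probability 1.
    probs = [min_density / d for d in densities]
    kept = [p for p, prob in zip(points, probs) if rng.random() < prob]
    return kept, probs


# Synthetic example: a high-density blob of 500 points plus 20 scattered points.
rng = random.Random(42)
dense = [(rng.gauss(0.0, 0.1), rng.gauss(0.0, 0.1)) for _ in range(500)]
sparse = [(rng.uniform(2.0, 10.0), rng.uniform(2.0, 10.0)) for _ in range(20)]
points = dense + sparse

kept, probs = thin(points)
print(f"kept {len(kept)} of {len(points)} points")
```

Points in the dense blob receive small retention probabilities and scattered points receive large ones, so the thinned sample keeps isolated observations while shrinking the high-density region. A robust clustering method would then be run on `kept`.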
Classification and Data Mining | 2013
Jukka Heikkonen; Domenico Perrotta; Marco Riani; Francesca Torti
The methods of very robust regression resist up to 50% of outliers. The algorithms for very robust regression rely on selecting numerous subsamples of the data. New algorithms for LMS and LTS estimators that have increased computational efficiency due to improved combinatorial sampling are proposed. These and other publicly available algorithms are compared for outlier detection. Timings and estimator quality are also considered. An algorithm using the forward search (FS) has the best properties for both size and power of the outlier tests.
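The reliance on many subsamples can be shown with the textbook random-subsample approximation to least median of squares (LMS) for a single explanatory variable. This is a minimal sketch and not the improved sampling scheme the abstract proposes. The function name, the number of subsamples and the synthetic data are hypothetical.

```python
import random
import statistics


def lms_line(xs, ys, n_subsamples=500, seed=0):
    """Approximate the LMS line from random two-point subsamples.

    Returns (median squared residual, intercept, slope) of the best candidate.
    """
    rng = random.Random(seed)
    n = len(xs)
    best = None
    for _ in range(n_subsamples):
        i, j = rng.sample(range(n), 2)
        if xs[i] == xs[j]:
            continue  # a vertical pair does not define a slope
        slope = (ys[j] - ys[i]) / (xs[j] - xs[i])
        intercept = ys[i] - slope * xs[i]
        # LMS criterion: median of the squared residuals over all observations.
        med = statistics.median(
            (y - intercept - slope * x) ** 2 for x, y in zip(xs, ys)
        )
        if best is None or med < best[0]:
            best = (med, intercept, slope)
    return best


# Synthetic data: 70 points near y = 1 + 2x and 30 gross outliers near y = 50.
rng = random.Random(1)
xs = [rng.uniform(0.0, 10.0) for _ in range(100)]
ys = [1.0 + 2.0 * x + rng.gauss(0.0, 0.1) for x in xs]
for k in range(30):
    ys[k] = 50.0 + rng.gauss(0.0, 1.0)

med, intercept, slope = lms_line(xs, ys)
print(f"LMS line: intercept {intercept:.2f}, slope {slope:.2f}")
```

Each candidate line is determined by only two observations, so a subsample free of outliers yields a line that fits the majority of the data. The number of subsamples controls how likely it is that at least one such clean subsample is drawn, which is why the sampling scheme matters for computational efficiency.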
Statistical Methods and Applications | 2018
Domenico Perrotta; Francesca Torti
The forward search is a powerful general method for detecting multiple masked outliers and for determining their effect on inferences about models fitted to data. From the monitoring of a series of statistics based on subsets of data of increasing size we obtain multiple views of any hidden structure. One of the problems of the forward search has always been the lack of an automatic link among the great variety of plots which are monitored. Often, many interesting features emerge unexpectedly during the progression of the forward search only when a specific combination of forward plots is inspected at the same time. Thus, the analyst should be able to interact with the plots and redefine or refine the links among them. In the absence of dynamic linking and interaction tools, the analyst risks missing relevant hidden information. In this paper we fill this gap and provide the user with a set of new robust graphical tools whose power will be demonstrated on several regression problems. Through the analysis of real and simulated data we give a series of examples where dynamic interaction with different “robust plots” is used to highlight the presence of groups of outliers and regression mixtures and appraise the effect that these hidden groups exert on the fitted model.
Journal of Business & Economic Statistics | 2018
Lucio Barabesi; Andrea Cerasa; Andrea Cerioli; Domenico Perrotta
We describe empirical work in the domain of clustering and outlier detection for the analysis of European trade data. It is our first attempt to evaluate the benefits and limitations of the forward search approach for regression and multivariate analysis (Atkinson and Riani, Robust diagnostic regression analysis, Springer, 2000; Atkinson et al., Exploring multivariate data with the forward search, Springer, 2004) within a concrete application scenario and in relation to a comparable backward method developed in the JRC by Arsenis et al. (Price outliers in EU external trade data, Enlargement and Integration Workshop 2005, 2005). Our findings suggest that automatic clustering based on Mahalanobis distances may be inappropriate in the presence of a high-density area in the dataset. Follow-up work is discussed extensively in Riani et al. (Fitting mixtures of regression lines with the forward search, Mining massive data sets for security, IOS, 2008).