Publication


Featured research published by Marco Riani.


Archive | 2004

Exploring multivariate data with the forward search

Anthony C. Atkinson; Marco Riani; Andrea Cerioli

Contents

Preface

Notation

1 Examples of Multivariate Data 1.1 Influence, Outliers and Distances 1.2 A Sketch of the Forward Search 1.3 Multivariate Normality and our Examples 1.4 Swiss Heads 1.5 National Track Records for Women 1.6 Municipalities in Emilia-Romagna 1.7 Swiss Bank Notes 1.8 Plan of the Book

2 Multivariate Data and the Forward Search 2.1 The Univariate Normal Distribution 2.1.1 Estimation 2.1.2 Distribution of Estimators 2.2 Estimation and the Multivariate Normal Distribution 2.2.1 The Multivariate Normal Distribution 2.2.2 The Wishart Distribution 2.2.3 Estimation of Σ 2.3 Hypothesis Testing 2.3.1 Hypotheses About the Mean 2.3.2 Hypotheses About the Variance 2.4 The Mahalanobis Distance 2.5 Some Deletion Results 2.5.1 The Deletion Mahalanobis Distance 2.5.2 The (Bartlett)-Sherman-Morrison-Woodbury Formula 2.5.3 Deletion Relationships Among Distances 2.6 Distribution of the Squared Mahalanobis Distance 2.7 Determinants of Dispersion Matrices and the Squared Mahalanobis Distance 2.8 Regression 2.9 Added Variables in Regression 2.10 The Mean Shift Outlier Model 2.11 Seemingly Unrelated Regression 2.12 The Forward Search 2.13 Starting the Search 2.13.1 The Babyfood Data 2.13.2 Robust Bivariate Boxplots from Peeling 2.13.3 Bivariate Boxplots from Ellipses 2.13.4 The Initial Subset 2.14 Monitoring the Search 2.15 The Forward Search for Regression Data 2.15.1 Univariate Regression 2.15.2 Multivariate Regression 2.16 Further Reading 2.17 Exercises 2.18 Solutions

3 Data from One Multivariate Distribution 3.1 Swiss Heads 3.2 National Track Records for Women 3.3 Municipalities in Emilia-Romagna 3.4 Swiss Bank Notes 3.5 What Have We Seen? 3.6 Exercises 3.7 Solutions

4 Multivariate Transformations to Normality 4.1 Background 4.2 An Introductory Example: the Babyfood Data 4.3 Power Transformations to Approximate Normality 4.3.1 Transformation of the Response in Regression 4.3.2 Multivariate Transformations to Normality 4.4 Score Tests for Transformations 4.5 Graphics for Transformations 4.6 Finding a Multivariate Transformation with the Forward Search 4.7 Babyfood Data 4.8 Swiss Heads 4.9 Horse Mussels 4.10 Municipalities in Emilia-Romagna 4.10.1 Demographic Variables 4.10.2 Wealth Variables 4.10.3 Work Variables 4.10.4 A Combined Analysis 4.11 National Track Records for Women 4.12 Dyestuff Data 4.13 Babyfood Data and Variable Selection 4.14 Suggestions for Further Reading 4.15 Exercises 4.16 Solutions

5 Principal Components Analysis 5.1 Background 5.2 Principal Components and Eigenvectors 5.2.1 Linear Transformations and Principal Components 5.2.2 Lack of Scale Invariance and Standardized Variables 5.2.3 The Number of Components 5.3 Monitoring the Forward Search 5.3.1 Principal Components and Variances 5.3.2 Principal Component Scores 5.3.3 Correlations Between Variables and Principal Components 5.3.4 Elements of the Eigenvectors 5.4 The Biplot and the Singular Value Decomposition 5.5 Swiss Heads 5.6 Milk Data 5.7 Quality of Life 5.8 Swiss Bank Notes 5.8.1 Forgeries and Genuine Notes 5.8.2 Forgeries Alone 5.9 Municipalities in Emilia-Romagna 5.10 Further Reading 5.11 Exercises 5.12 Solutions

6 Discriminant Analysis 6.1 Background 6.2 An Outline of Discriminant Analysis 6.2.1 Bayesian Discrimination 6.2.2 Quadratic Discriminant Analysis 6.2.3 Linear Discriminant Analysis 6.2.4 Estimation of Means and Variances 6.2.5 Canonical Variates 6.2.6 Assessment of Discriminant Rules 6.3 The Forward Search 6.3.1 Step 1: Choice of the Initial Subset 6.3.2 Step 2: Adding


Computational Statistics & Data Analysis | 1998

Robust bivariate boxplots and multiple outlier detection

Sergio Zani; Marco Riani; Aldo Corbellini

In this paper we suggest a simple way of constructing a bivariate boxplot based on convex hull peeling and B-spline smoothing. The proposed method has some advantages over that suggested by Goldberg and Iglewicz (1992). Our approach leads to a natural inner region which is completely nonparametric and smooth. Furthermore, it retains the correlation in the observations and adapts to differing spread of the data in different directions. The outer contour, which is based on a multiple of the distance of the inner region from the centre, is robust to the presence of clusters of outliers. We also show how the construction of a bivariate boxplot for each pair of variables can become a very useful tool for the detection of multivariate outliers.
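The peeling step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the 50% stopping rule and the use of Andrew's monotone chain for the hull are assumptions made for illustration, and the B-spline smoothing and the outer contour are omitted.

```python
def cross(o, a, b):
    """Signed area of the turn o -> a -> b (positive for a left turn)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain: hull vertices in counter-clockwise order.

    Points lying on a hull edge (collinear with two vertices) are not
    reported as vertices.
    """
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def peel_to_inner_hull(points, fraction=0.5):
    """Strip hull vertices layer by layer (hypothetical stopping rule).

    Peeling stops just before fewer than `fraction` of the distinct
    points would remain. Returns the hull of the retained points and
    the retained points themselves.
    """
    remaining = list(set(points))
    target = fraction * len(remaining)
    while True:
        hull = convex_hull(remaining)
        hull_set = set(hull)
        inner = [p for p in remaining if p not in hull_set]
        if len(inner) < target or len(inner) < 3:
            return hull, remaining
        remaining = inner


if __name__ == "__main__":
    # A 5 x 5 grid with one remote point; the remote point is peeled first.
    grid = [(x, y) for x in range(5) for y in range(5)]
    hull, core = peel_to_inner_hull(grid + [(10, 10)])
    print(len(core), hull)
```

Because the outermost layers are discarded before the inner hull is formed, a remote observation cannot stretch the inner region, which is the robustness property the abstract refers to.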


Computational Statistics & Data Analysis | 2007

Exploratory tools for clustering multivariate data

Anthony C. Atkinson; Marco Riani

The forward search provides a series of robust parameter estimates based on increasing numbers of observations. The resulting series of robust Mahalanobis distances is used to cluster multivariate normal data. The method depends on envelopes of the distribution of the test statistics in forward plots. These envelopes can be found by simulation; flexible polynomial approximations to the envelopes are given. New graphical tools provide methods not only of detecting clusters but also of determining their membership. Comparisons are made with mclust and k-means clustering.
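A minimal sketch of a forward search for bivariate data may help make the first two sentences of this abstract concrete. Everything here is an illustrative assumption rather than the authors' code: the helper names are hypothetical, the starting subset is simply the m points nearest the coordinate-wise medians, and no simulation envelopes are computed.

```python
from statistics import median


def mean_and_cov(rows):
    """Sample mean and covariance (sxx, sxy, syy) of bivariate rows."""
    n = len(rows)
    mx = sum(x for x, _ in rows) / n
    my = sum(y for _, y in rows) / n
    sxx = sum((x - mx) ** 2 for x, _ in rows) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in rows) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in rows) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def squared_distance(point, centre, cov):
    """Squared Mahalanobis distance using the explicit 2 x 2 inverse."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2  # zero if the subset is collinear
    dx, dy = point[0] - centre[0], point[1] - centre[1]
    return (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det


def initial_subset(data, m):
    """Assumed starting rule: the m points nearest the coordinate-wise medians."""
    med_x = median(x for x, _ in data)
    med_y = median(y for _, y in data)
    order = sorted(range(len(data)),
                   key=lambda i: abs(data[i][0] - med_x) + abs(data[i][1] - med_y))
    return order[:m]


def forward_search(data, m):
    """Grow the fitting subset one unit at a time.

    Returns a list of (subset size, minimum Mahalanobis distance among
    the units not in the subset), the quantity shown in a forward plot.
    """
    n = len(data)
    subset = initial_subset(data, m)
    trace = []
    while len(subset) < n:
        centre, cov = mean_and_cov([data[i] for i in subset])
        d2 = [squared_distance(p, centre, cov) for p in data]
        inside = set(subset)
        min_outside = min(d2[i] for i in range(n) if i not in inside)
        trace.append((len(subset), min_outside ** 0.5))
        # The next subset holds the units with the smallest distances.
        subset = sorted(range(n), key=d2.__getitem__)[: len(subset) + 1]
    return trace


if __name__ == "__main__":
    grid = [(x, y) for x in (-1, 0, 1) for y in (-1, 0, 1)]
    for size, dist in forward_search(grid + [(6.0, -6.0)], m=5):
        print(size, round(dist, 3))
```

In this toy example the remote point is the last to join, so the monitored distance rises sharply at the final step. In the paper such traces are judged against envelopes of the distribution of the statistic, which this sketch does not attempt to reproduce.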


Journal of Computational and Graphical Statistics | 1999

The Ordering of Spatial Data and the Detection of Multiple Outliers

Andrea Cerioli; Marco Riani

In this article we suggest a unified approach to the exploratory analysis of spatial data. Our technique is based on a forward search algorithm that orders the observations from those most in agreement with a specified autocorrelation model to those least in agreement with it. This leads to the identification of spatial outliers (that is, extreme observations with respect to their neighboring values) and of nonstationary pockets. In particular, the focus of our analysis is on spatial prediction models. We show that standard deletion diagnostics for prediction are affected by masking and swamping problems when multiple outliers are present. The effectiveness of the suggested method in detecting masked multiple outliers, and more generally in ordering spatial data, is shown by means of a number of simulated datasets. These examples clearly reveal the power of our method in getting inside the data in a way that is simpler and more powerful than standard diagnostic procedures. Further...


Journal of Computational and Graphical Statistics | 2006

Distribution Theory and Simulations for Tests of Outliers in Regression

Anthony C. Atkinson; Marco Riani

This article provides distributional results for testing multiple outliers in regression. Because direct simulation of each combination of number of observations and number of parameters is too time consuming, three straightforward methods using truncated simple samples are described for approximating the pointwise distribution of the test statistic. Scaling factors are found to adjust for the number of parameters. The same simulations also provide a powerful method of calibrating pointwise inferences for simultaneous tests for an unknown number of outliers. Analysis of data on fidelity cards reveals an unexpected group of outliers.


Journal of Computational and Graphical Statistics | 2001

A Unified Approach to Outliers, Influence, and Transformations in Discriminant Analysis

Marco Riani; Anthony C. Atkinson

This article extends the analysis of multivariate transformations to linear and quadratic discriminant analysis. It shows that the standard application of deletion diagnostic techniques for validating a particular transformation suffers from masking and so may fail if several outliers are present. We therefore suggest a simple and powerful method which is based on a forward search algorithm. This robust diagnostic procedure orders the observations from those most in agreement with the suggested model to those least in agreement with it. It provides a unified approach to the detection of influential observations and outliers in discriminant analysis. Simulated and real data are used to show the necessity of considering multivariate transformations in discriminant analysis. The examples demonstrate the power of the suggested approach in revealing the correct structure of the data when this is obscured by outliers.


Technometrics | 2000

Robust Diagnostic Data Analysis: Transformations in Regression

Marco Riani; Anthony C. Atkinson

We introduce a very general “forward” method of data analysis that starts from a small, robustly chosen subset of the data and shows the effect of adding observations by a forward search. Powerful diagnostic procedures result: The observations are ordered by their agreement with the proposed transformation, masking is overcome, and the inferential effect of each observation is clear. We apply the resulting method to the transformation of both univariate and multivariate data. Other applications of the forward search are mentioned.


Advances in Data Analysis and Classification | 2007

Fast calibrations of the forward search for testing multiple outliers in regression

Marco Riani; Anthony C. Atkinson

The paper considers the problem of testing for multiple outliers in a regression model and provides fast approximations to the null distribution of the minimum deletion residual used as a test statistic. Since direct simulation of each combination of number of observations and number of parameters is too time consuming, methods using simple normal samples are described for approximating the pointwise distribution of the test statistic. One approximation is based on adjustments to the results of simple simulations. The other uses properties of order statistics from folded t distributions to move outside the significance levels available by simulation. Analyses of data with beta errors and of transformed data on survival times demonstrate the usefulness in graphical methods of the inclusion of our bounds.


Archive | 2006

Random Start Forward Searches with Envelopes for Detecting Clusters in Multivariate Data

Anthony B. Atkinson; Marco Riani; Andrea Cerioli

During a forward search the plot of minimum Mahalanobis distances of observations not in the subset provides a test for outliers. However, if clusters are present in the data, their simple identification requires that there are searches that initially include a preponderance of observations from each of the unknown clusters. We use random starts to provide such searches, combined with simulation envelopes for precise inference about clustering.


Electronic Journal of Statistics | 2014

Monitoring robust regression

Marco Riani; Andrea Cerioli; Anthony C. Atkinson; Domenico Perrotta

Robust methods are little applied (although much studied by statisticians). We monitor very robust regression by looking at the behaviour of residuals and test statistics as we smoothly change the robustness of parameter estimation from a breakdown point of 50% to non-robust least squares. The resulting procedure provides insight into the structure of the data including outliers and the presence of more than one population. Monitoring overcomes the hindrances to the routine adoption of robust methods, being informative about the choice between the various robust procedures. Methods tuned to give nominal high efficiency fail with our most complicated example. We find that the most informative analyses come from S estimates combined with Tukey's biweight or with the optimal ρ functions. For our major example with 1,949 observations and 13 explanatory variables, we combine robust S estimation with regression using the forward search, so obtaining an understanding of the importance of individual observations, which is missing from standard robust procedures. We discover that the data come from two different populations. They also contain six outliers. Our analyses are accompanied by numerous graphs. Algebraic results are contained in two appendices, the second of which provides useful new results on the absolute odd moments of elliptically truncated multivariate normal random variables.
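As a small illustration of the monitoring idea, the sketch below evaluates Tukey's biweight ρ function and its weight function for a range of tuning constants. It is an assumption-laden toy, not the paper's procedure: the function names are hypothetical, the scaled residuals are made up, and the constants 1.547 and 4.685 are the commonly quoted values for a 50% breakdown point and 95% efficiency, which do not come from the abstract.

```python
def tukey_rho(u, c):
    """Tukey's biweight rho: bounded at c**2 / 6 once |u| reaches c."""
    if abs(u) >= c:
        return c * c / 6
    t = 1 - (u / c) ** 2
    return (c * c / 6) * (1 - t ** 3)


def tukey_weight(u, c):
    """Weight psi(u) / u, which is exactly zero for |u| >= c."""
    if abs(u) >= c:
        return 0.0
    return (1 - (u / c) ** 2) ** 2


def monitor_weights(scaled_residuals, constants):
    """Weights given to each scaled residual as the tuning constant varies."""
    return {c: [tukey_weight(r, c) for r in scaled_residuals] for c in constants}


if __name__ == "__main__":
    residuals = [0.2, -1.0, 1.4, 3.0, -6.0]  # made-up scaled residuals
    for c, weights in monitor_weights(residuals, [1.547, 2.5, 3.5, 4.685]).items():
        print(c, [round(w, 3) for w in weights])
```

Small constants give zero weight to moderately large residuals, while large constants let them influence the fit. Watching how weights or residuals change across such a grid is the kind of information that monitoring is meant to expose.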

Collaboration

Top Co-Authors

Anthony C. Atkinson

London School of Economics and Political Science

Anthony B. Atkinson

London School of Economics and Political Science