Publication


Featured research published by Matias Salibian-Barrera.


Journal of Computational and Graphical Statistics | 2006

A Fast Algorithm for S-Regression Estimates

Matias Salibian-Barrera; Victor J. Yohai

Equivariant high-breakdown point regression estimates are computationally expensive, and the corresponding algorithms become infeasible for a moderately large number of regressors. One important advance to improve the computational speed of one such estimator is the fast-LTS algorithm. This article proposes an analogous algorithm for computing S-estimates. The new algorithm, which we call “fast-S”, is also based on a “local improvement” step of the resampling initial candidates. This allows for a substantial reduction of the number of candidates required to obtain a good approximation to the optimal solution. We performed a simulation study which shows that S-estimators computed with the fast-S algorithm compare favorably to the LTS-estimators computed with the fast-LTS algorithm.
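The idea of combining random subsampling with a few local improvement steps can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names (`rho`, `m_scale`, `refine`, `fast_s`), the number of candidates, and the use of a fully iterated M-scale inside each improvement step are assumptions made here for clarity.

```python
import numpy as np

C = 1.547  # bisquare tuning constant giving a 50% breakdown point with b = 0.5


def rho(u, c=C):
    """Tukey bisquare loss, scaled so that rho(u) = 1 for |u| >= c."""
    v = np.minimum(np.abs(u) / c, 1.0)
    return 1.0 - (1.0 - v**2) ** 3


def m_scale(r, b=0.5, c=C, iters=100, tol=1e-8):
    """M-scale s solving mean(rho(r / s)) = b by fixed-point iteration."""
    s = np.median(np.abs(r)) / 0.6745
    if s <= 0:
        return 0.0
    for _ in range(iters):
        s_new = s * np.sqrt(np.mean(rho(r / s, c)) / b)
        converged = abs(s_new - s) <= tol * s
        s = s_new
        if converged:
            break
    return s


def refine(X, y, beta, steps, c=C):
    """Local improvement: a few reweighted least squares steps from beta."""
    for _ in range(steps):
        r = y - X @ beta
        s = m_scale(r, c=c)
        if s == 0:
            break
        # Bisquare weights psi(u) / u, up to a constant factor.
        w = np.where(np.abs(r / s) < c, (1 - (r / (s * c)) ** 2) ** 2, 0.0)
        sw = np.sqrt(w)
        beta = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
    return beta, m_scale(y - X @ beta, c=c)


def fast_s(X, y, n_candidates=50, keep=5, seed=0):
    """Approximate the S-estimate: subsample, improve briefly, refine the best."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    candidates = []
    for _ in range(n_candidates):
        idx = rng.choice(n, size=p, replace=False)  # elemental subsample
        beta0 = np.linalg.lstsq(X[idx], y[idx], rcond=None)[0]
        candidates.append(refine(X, y, beta0, steps=2))
    candidates.sort(key=lambda t: t[1])  # smallest residual scale first
    finalists = [refine(X, y, b, steps=50) for b, _ in candidates[:keep]]
    return min(finalists, key=lambda t: t[1])
```

Calling `fast_s(X, y)` on a design matrix `X` (including an intercept column, if wanted) returns the coefficient vector and the corresponding residual M-scale. Because every candidate is improved before being compared, far fewer subsamples are needed than with plain random resampling.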


Journal of the American Statistical Association | 2006

Principal Components Analysis Based on Multivariate MM Estimators With Fast and Robust Bootstrap

Matias Salibian-Barrera; Stefan Van Aelst; Gert Willems

We consider robust principal components analysis (PCA) based on multivariate MM estimators. We first study the robustness and efficiency of these estimators, particularly in terms of eigenvalues and eigenvectors. We then focus on inference procedures based on a fast and robust bootstrap for MM estimators. This method is an alternative to the approach based on the asymptotic distribution of the estimators and can also be used to assess the stability of the principal components. A formal consistency proof for the bootstrap method is given, and its finite-sample performance is investigated through simulations. We illustrate the use of the robust PCA and the bootstrap inference on a real dataset.


Statistical Methods and Applications | 2008

Fast and robust bootstrap

Matias Salibian-Barrera; Stefan Van Aelst; Gert Willems

In this paper we review recent developments on a bootstrap method for robust estimators which is computationally faster and more resistant to outliers than the classical bootstrap. This fast and robust bootstrap method is, under reasonable regularity conditions, asymptotically consistent. We describe the method in general and then consider its application to perform inference based on robust estimators for the linear regression and multivariate location-scatter models. In particular, we study confidence and prediction intervals and tests of hypotheses for linear regression models, inference for location-scatter parameters and principal components, and classification error estimation for discriminant analysis.
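A simplified sketch of the fast and robust bootstrap idea for a location M-estimate may help fix ideas. It assumes a Huber weight function and treats the scale (a normalized MAD) as fixed, so it omits the scale correction that the full method includes; the function names and defaults are hypothetical. Each bootstrap replicate is a single weighted mean using the weights from the original fit, followed by a linear correction, so no iterative refitting is needed on the bootstrap samples.

```python
import numpy as np


def huber_weights(r, c=1.345):
    """Huber weights w(r) = psi(r) / r = min(1, c / |r|)."""
    a = np.abs(r)
    return np.where(a <= c, 1.0, c / np.maximum(a, 1e-12))


def m_location(x, c=1.345, iters=100, tol=1e-10):
    """Huber M-estimate of location with the scale fixed at the normalized MAD."""
    mu = np.median(x)
    s = np.median(np.abs(x - mu)) / 0.6745
    for _ in range(iters):
        w = huber_weights((x - mu) / s, c)
        mu_new = np.sum(w * x) / np.sum(w)
        done = abs(mu_new - mu) < tol * s
        mu = mu_new
        if done:
            break
    return mu, s


def frb_location(x, n_boot=1000, c=1.345, seed=0):
    """Fast and robust bootstrap replicates of the location estimate.

    Assumption: the scale is treated as known, so only the location
    equation is bootstrapped and corrected.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    n = len(x)
    mu_hat, s = m_location(x, c)
    r = (x - mu_hat) / s
    w = huber_weights(r, c)                 # weights frozen at the original fit
    psi_prime = (np.abs(r) <= c).astype(float)
    # Linear correction 1 / (1 - g'(mu_hat)) for the map g(mu) = sum(w x) / sum(w).
    correction = np.sum(w) / np.sum(psi_prime)
    idx = rng.integers(0, n, size=(n_boot, n))
    mu_star = (w[idx] * x[idx]).sum(axis=1) / w[idx].sum(axis=1)
    return mu_hat, mu_hat + correction * (mu_star - mu_hat)
```

Outliers receive small weights in the original fit and keep those small weights in every bootstrap sample, which is why a resample containing many outliers cannot break the replicate. Percentile intervals can then be read off the returned replicates, for example with `np.quantile`.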


Journal of Statistical Computation and Simulation | 2007

On tests for multivariate normality and associated simulation studies

Patrick J. Farrell; Matias Salibian-Barrera; Katarzyna Naczk

We study the empirical size and power of some recently proposed tests for multivariate normality (MVN) and compare them with the existing proposals that performed best in previously published studies. We show that Royston's [Royston, J.P., 1983b, Some techniques for assessing multivariate normality based on the Shapiro-Wilk W. Applied Statistics, 32, 121–133.] extension to the Shapiro and Wilk [Shapiro, S.S., Wilk, M.B., 1965, An analysis of variance test for normality (complete samples). Biometrika, 52, 591–611.] test is unable to achieve the nominal significance level, and consider a subsequent extension proposed by Royston [Royston, J.P., 1992, Approximating the Shapiro–Wilk W-Test for non-normality. Statistics and Computing, 2, 117–119.] to correct this problem, which earlier studies appear to have ignored. A consistent and invariant test proposed by Henze and Zirkler [Henze, N., Zirkler, B., 1990, A class of invariant consistent tests for multivariate normality. Communications in Statistics—Theory and Methods, 19, 3595–3617.] is found to have good power properties, particularly for sample sizes of 75 or more, while an approach suggested by Royston [Royston, J.P., 1992, Approximating the Shapiro–Wilk W-Test for non-normality. Statistics and Computing, 2, 117–119.] performs effectively at detecting departures from MVN for smaller sample sizes. We also compare our results to those of previous simulation studies, and discuss the challenges associated with generating multivariate data for such investigations.
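As an illustration of one of the tests mentioned above, the Henze-Zirkler statistic can be computed in a few lines. The sketch below is an assumption-laden example rather than the code used in the study: it uses the usual smoothing parameter for the statistic, requires at least two variables, and replaces the lognormal approximation of the original proposal with a Monte Carlo p-value, which is valid because the statistic is affine invariant. The function names are hypothetical.

```python
import numpy as np


def henze_zirkler(X):
    """Henze-Zirkler statistic for an (n, d) data matrix with d >= 2."""
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    Xc = X - X.mean(axis=0)
    S = np.atleast_2d(np.cov(X, rowvar=False, bias=True))  # divisor n
    G = Xc @ np.linalg.inv(S) @ Xc.T
    Di = np.diag(G)                            # squared Mahalanobis norms
    Dij = Di[:, None] + Di[None, :] - 2.0 * G  # squared pairwise distances
    beta = ((2 * d + 1) * n / 4.0) ** (1.0 / (d + 4)) / np.sqrt(2.0)
    b2 = beta**2
    return (
        np.exp(-b2 / 2.0 * Dij).sum() / n
        - 2.0 * (1 + b2) ** (-d / 2.0) * np.exp(-b2 / (2.0 * (1 + b2)) * Di).sum()
        + n * (1 + 2 * b2) ** (-d / 2.0)
    )


def hz_pvalue(X, n_sim=200, seed=0):
    """Monte Carlo p-value: simulate the null from N(0, I) samples of equal size."""
    rng = np.random.default_rng(seed)
    n, d = np.shape(X)
    stat = henze_zirkler(X)
    null = np.array(
        [henze_zirkler(rng.standard_normal((n, d))) for _ in range(n_sim)]
    )
    return stat, (1 + np.sum(null >= stat)) / (n_sim + 1)
```

Large values of the statistic indicate departure from multivariate normality. The same simulation loop, run over data generated from alternative distributions, gives empirical size and power estimates of the kind compared in the paper.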


Computational Statistics & Data Analysis | 2008

Robust model selection using fast and robust bootstrap

Matias Salibian-Barrera; Stefan Van Aelst

Robust model selection procedures control the undue influence that outliers can have on the selection criteria by using both robust point estimators and a bounded loss function when measuring either the goodness-of-fit or the expected prediction error of each model. Furthermore, to avoid favoring over-fitting models, these two measures can be combined with a penalty term for the size of the model. The expected prediction error conditional on the observed data may be estimated using the bootstrap. However, bootstrapping robust estimators becomes extremely time consuming on moderate to high dimensional data sets. It is shown that the expected prediction error can be estimated using a very fast and robust bootstrap method, and that this approach yields a consistent model selection method that is computationally feasible even for a relatively large number of covariates. Moreover, as opposed to other bootstrap methods, this proposal avoids the numerical problems associated with the small bootstrap samples required to obtain consistent model selection criteria. The finite-sample performance of the fast and robust bootstrap model selection method is investigated through a simulation study while its feasibility and good performance on moderately large regression models are illustrated on several real data examples.


Archive | 2000

Contributions to the theory of robust inference

Matias Salibian-Barrera

We study the problem of performing statistical inference based on robust estimates when the distribution of the data is only assumed to belong to a contamination neighbourhood of a known central distribution. We start by determining the asymptotic properties of some robust estimates when the data are not generated by the central distribution of the contamination neighbourhood. Under certain regularity conditions the considered estimates are consistent and asymptotically normal. For the location model and with additional regularity conditions we show that the convergence is uniform on the contamination neighbourhood. We determine that a class of robust estimates satisfies these requirements for certain proportions of contamination, and that there is a trade-off between the robustness of the estimates and the extent to which the uniformity of their asymptotic properties holds. When the distribution of the data is not the central distribution of the neighbourhood the asymptotic variance of these estimates is involved and difficult to estimate. This problem affects the performance of inference methods based on the empirical estimates of the asymptotic variance. We present a new re-sampling method based on Efron's bootstrap (Efron, 1979) to estimate the sampling distribution of MM-location and regression estimates. This method overcomes the main drawbacks of the use of bootstrap with robust estimates on large and potentially contaminated data sets. We show that our proposal is computationally simple and that it provides stable estimates when the data contain outliers. This new method extends naturally to the linear regression model.


Journal of the American Statistical Association | 2011

An Outlier-Robust Fit for Generalized Additive Models With Applications to Disease Outbreak Detection

Azadeh Alimadad; Matias Salibian-Barrera

We are interested in a class of unsupervised methods to detect possible disease outbreaks, that is, rapid increases in the number of cases of a particular disease that deviate from the pattern observed in the past. The motivating application for this article deals with detecting outbreaks using generalized additive models (GAMs) to model weekly counts of certain infectious diseases. We can use the distance between the predicted and observed counts for a specific week to determine whether an important departure has occurred. Unfortunately, this approach may not work as desired because GAMs can be very sensitive to the presence of a small proportion of observations that deviate from the assumed model. Thus, the outbreak may affect the predicted values causing these to be close to the atypical counts, and thus mask the outliers by having them appear not to be too extreme or atypical. We illustrate this phenomenon with influenza-like-illness doctor-visits data from the United States for the 2006–2008 flu seasons. One way to avoid this masking problem is to derive an algorithm to fit GAM models that can resist the effect of a small number of atypical observations. In this article we discuss such an outlier-robust fit for GAMs based on the backfitting algorithm. The basic idea is to replace the maximum likelihood based weights used in the generalized local scoring algorithm with those derived from robust quasi-likelihood equations (Cantoni and Ronchetti 2001b). These robust estimators for generalized linear models work well for the Poisson family of distributions, and also for binomial distributions with relatively large numbers of trials. We show that the resulting estimated mean function is resistant to the presence of outliers in the response variable and that it also remains close to the usual GAM estimator when the data do not contain atypical observations. 
We illustrate the use of this approach on the detection of the recent outbreak of H1N1 flu by looking at the weekly counts of influenza-like-illness (ILI) doctor visits, as reported through the U.S. Outpatient Influenza-like Illness Surveillance Network (ILINet), and also apply our method to the numbers of requested isolates in Canada. Weeks with a sudden increase in ILI visits or requested isolates are much more clearly identified as atypical by the robust fit because the observed counts are far from the ones predicted by the fitted GAM model.


Journal of Computational and Graphical Statistics | 2008

The Fast-τ Estimator for Regression

Matias Salibian-Barrera; Gert Willems; Ruben H. Zamar

Yohai and Zamar's τ-estimators of regression have excellent statistical properties but are nevertheless rarely used in practice because of a lack of available software and the general impression that τ-estimators are difficult to approximate. We will show, however, that the computational difficulties of approximating τ-estimators are similar in nature to those of the more popular S-estimators. The main goal of this article is to compare an approximating algorithm for τ-estimators based on random resampling with some alternative heuristic search algorithms. We show that the former is not only simpler, but that when enhanced by local improvement steps it generally outperforms the considered heuristic search algorithms, even when these heuristic algorithms also incorporate local improvement steps. Additionally, we show that the random resampling algorithm for approximating τ-estimators has favorable statistical properties compared to the analogous and widely used algorithms for S- and least trimmed squares estimators.
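To make the objective concrete, the sketch below computes a τ-scale of a residual vector and approximates the τ-regression estimate by plain random resampling. It is illustrative only: the bisquare loss, the tuning constants `c1 = 1.548` and `c2 = 6.08`, and all function names are assumptions made here, and the local improvement steps discussed in the article are deliberately left out.

```python
import numpy as np


def rho(u, c):
    """Tukey bisquare loss, scaled so that rho(u) = 1 for |u| >= c."""
    v = np.minimum(np.abs(u) / c, 1.0)
    return 1.0 - (1.0 - v**2) ** 3


def m_scale(r, c=1.548, b=0.5, iters=100, tol=1e-8):
    """M-scale s solving mean(rho(r / s, c)) = b by fixed-point iteration."""
    s = np.median(np.abs(r)) / 0.6745
    if s <= 0:
        return 0.0
    for _ in range(iters):
        s_new = s * np.sqrt(np.mean(rho(r / s, c)) / b)
        done = abs(s_new - s) <= tol * s
        s = s_new
        if done:
            break
    return s


def tau_scale(r, c1=1.548, c2=6.08, b=0.5):
    """tau-scale: tau^2 = s^2 * mean(rho(r / s, c2)), with s the M-scale for c1."""
    s = m_scale(r, c=c1, b=b)
    if s == 0:
        return 0.0
    return s * np.sqrt(np.mean(rho(r / s, c2)))


def tau_random_resampling(X, y, n_candidates=300, seed=0):
    """Keep the elemental-subsample fit with the smallest tau-scale of residuals."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    best_beta, best_tau = None, np.inf
    for _ in range(n_candidates):
        idx = rng.choice(n, size=p, replace=False)
        beta = np.linalg.lstsq(X[idx], y[idx], rcond=None)[0]
        t = tau_scale(y - X @ beta)
        if t < best_tau:
            best_beta, best_tau = beta, t
    return best_beta, best_tau
```

The structure mirrors the resampling algorithms used for S-estimators; only the objective function changes, which is consistent with the article's remark that the computational difficulties are similar in nature.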


Journal of Computational and Graphical Statistics | 2010

S-Estimation for Penalized Regression Splines

Kukatharmini Tharmaratnam; Gerda Claeskens; Christophe Croux; Matias Salibian-Barrera

This article is about S-estimation for penalized regression splines. Penalized regression splines are one of the currently most used methods for smoothing noisy data. The estimation method used for fitting such a penalized regression spline model is mostly based on least squares methods, which are known to be sensitive to outlying observations. In real-world applications, outliers are quite commonly observed. There are several robust estimation methods taking outlying observations into account. We define and study S-estimators for penalized regression spline models. Hereby we replace the least squares estimation method for penalized regression splines by a suitable S-estimation method. By keeping the modeling by means of splines and by keeping the penalty term, though using S-estimators instead of least squares estimators, we arrive at an estimation method that is both robust and flexible enough to capture nonlinear trends in the data. Simulated data and a real data example are used to illustrate the effectiveness of the procedure. Software code (for use with R) is available online.


Annals of Statistics | 2004

Uniform asymptotics for robust location estimates when the scale is unknown

Matias Salibian-Barrera; Ruben H. Zamar

Most asymptotic results for robust estimates rely on regularity conditions that are difficult to verify and that real data sets rarely satisfy. Moreover, these results apply to fixed distribution functions. In the robustness context the distribution of the data remains largely unspecified and hence results that hold uniformly over a set of possible distribution functions are of theoretical and practical interest. In this paper we study the problem of obtaining verifiable and realistic conditions that suffice to obtain uniform consistency and uniform asymptotic normality for location robust estimates when the scale of the errors is unknown. We study M-location estimates calculated with an S-scale and we obtain uniform asymptotic results over contamination neighbourhoods. There is a trade-off between the size of these neighbourhoods and the breakdown point of the scale estimate. We also show how to calculate the maximum size of the contamination neighbourhoods where these uniform results hold.

Collaboration


Dive into Matias Salibian-Barrera's collaborations.

Top Co-Authors

Stefan Van Aelst
Katholieke Universiteit Leuven

Ruben H. Zamar
University of British Columbia

Victor J. Yohai
University of Buenos Aires

Christophe Croux
Katholieke Universiteit Leuven

Jorge Adrover
National University of Cordoba

Marek Omelka
Charles University in Prague

Gerda Claeskens
Katholieke Universiteit Leuven