
Publication


Featured research published by Carlos Matrán.


Annals of Statistics | 2008

A general trimming approach to robust cluster analysis

Luis Angel García-Escudero; Alfonso Gordaliza; Carlos Matrán; Agustín Mayo-Iscar

We introduce a new method for performing clustering with the aim of fitting clusters with different scatters and weights. It is designed to handle a proportion of contaminating data, which guarantees the robustness of the method. As a characteristic feature, restrictions on the ratio between the maximum and the minimum eigenvalues of the groups' scatter matrices are introduced. This makes the problem well defined and guarantees the consistency of the sample solutions to the population ones. The method covers a wide range of clustering approaches depending on the strength of the chosen restrictions. Our proposal includes an algorithm for approximately solving the sample problem.
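The eigenvalue-ratio restriction mentioned in this abstract can be illustrated with a minimal sketch. It assumes two-dimensional data, so that the eigenvalues of each symmetric scatter matrix have a closed form. The function names and the bound `c` are hypothetical and are not taken from the paper, and the sketch only checks the restriction; it does not reproduce the authors' algorithm.

```python
import math

def eigenvalues_2x2(s):
    """Eigenvalues of a symmetric 2x2 matrix [[a, b], [b, d]], smallest first."""
    (a, b), (_, d) = s
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    return mean - radius, mean + radius

def eigenvalue_ratio(scatters):
    """Largest eigenvalue divided by smallest, taken over all group scatter matrices."""
    eigs = [lam for s in scatters for lam in eigenvalues_2x2(s)]
    return max(eigs) / min(eigs)

def satisfies_restriction(scatters, c):
    """True when the eigenvalue ratio does not exceed the (hypothetical) bound c >= 1."""
    return eigenvalue_ratio(scatters) <= c

# Two illustrative group scatter matrices (made-up values).
scatters = [[[1.0, 0.0], [0.0, 1.0]], [[4.0, 0.0], [0.0, 2.0]]]
ratio = eigenvalue_ratio(scatters)
```

A ratio of exactly 1 is only possible when every scatter matrix is the same multiple of the identity, so smaller bounds force more similar, more spherical groups, while larger bounds allow more heterogeneous ones.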


Advances in Data Analysis and Classification | 2010

A review of robust clustering methods

Luis Angel García-Escudero; Alfonso Gordaliza; Carlos Matrán; Agustín Mayo-Iscar

Deviations from theoretical assumptions, together with the presence of a certain amount of outlying observations, are common in many practical statistical applications. This is also the case when applying Cluster Analysis methods, where those troubles could lead to unsatisfactory clustering results. Robust Clustering methods are aimed at avoiding these unsatisfactory results. Moreover, there exist certain connections between robust procedures and Cluster Analysis that make Robust Clustering an appealing unifying framework. A review of different robust clustering approaches in the literature is presented. Special attention is paid to methods based on trimming, which try to discard the most outlying data when carrying out the clustering process.


Probability Theory and Related Fields | 1988

The strong law of large numbers for k-means and best possible nets of Banach valued random variables

Juan A. Cuesta; Carlos Matrán

Summary: Let B be a uniformly convex Banach space, X a B-valued random variable and k a given positive integer. A random sample of X is substituted by the set of k elements which minimizes a criterion. We find conditions ensuring that this set converges a.s., as the sample size increases, to the set of k elements which minimizes the same criterion for X.


Test | 2000

Contributions of empirical and quantile processes to the asymptotic theory of goodness-of-fit tests

Eustasio del Barrio; Juan A. Cuesta-Albertos; Carlos Matrán; Sándor Csörgö; Carles M. Cuadras; Tertius de Wet; Evarist Giné; Richard A. Lockhart; Axel Munk; Winfried Stute

This paper analyzes the evolution of the asymptotic theory of goodness-of-fit tests. We emphasize the parallel development of this theory and the theory of empirical and quantile processes. Our study includes the analysis of the main tests of fit based on the empirical distribution function, that is, tests of the Cramér-von Mises or Kolmogorov-Smirnov type. We pay special attention to the problem of testing fit to a location scale family. We provide a new approach, based on the Wasserstein distance, to correlation and regression tests, outlining some of their properties and explaining their limitations.


Journal of Computational and Graphical Statistics | 2003

Trimming Tools in Exploratory Data Analysis

Luis Angel García-Escudero; Alfonso Gordaliza; Carlos Matrán

Exploratory graphical tools based on trimming are proposed for detecting main clusters in a given dataset. The trimming is obtained by resorting to trimmed k-means methodology. The analysis always reduces to the examination of real-valued curves, even in the multivariate case. As the technique is based on a robust clustering criterion, it is able to handle the presence of different kinds of outliers. An algorithm is proposed to carry out this (computer-intensive) method. As with classical k-means, the method is especially oriented to mixtures of spherical distributions. A possible generalization is outlined to overcome this drawback.
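The trimmed k-means idea referred to in this abstract can be sketched as follows. This is a minimal heuristic under stated assumptions, not the algorithm proposed in the paper: the function name, the explicit `init_centers` argument, and the trimming level `alpha` are illustrative choices. Each pass assigns points to their nearest center, sets aside the fraction `alpha` of points lying farthest from their centers, and recomputes the centers from the retained points only.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def trimmed_kmeans(points, init_centers, alpha, n_iter=50):
    """Return (centers, labels); label -1 marks a trimmed point."""
    centers = list(init_centers)
    k = len(centers)
    n_keep = len(points) - int(alpha * len(points))
    labels = [-1] * len(points)
    for _ in range(n_iter):
        # Nearest center and squared distance to it, for every point.
        nearest = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        dists = [sq_dist(p, centers[j]) for p, j in zip(points, nearest)]
        # Retain the n_keep points closest to their centers; trim the rest.
        order = sorted(range(len(points)), key=lambda i: dists[i])
        kept = set(order[:n_keep])
        new_labels = [nearest[i] if i in kept else -1 for i in range(len(points))]
        # Recompute each center as the mean of its retained points.
        for j in range(k):
            members = [points[i] for i in kept if nearest[i] == j]
            if members:
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
        if new_labels == labels:
            break
        labels = new_labels
    return centers, labels

# Two tight groups plus one gross outlier (made-up data).
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1),
          (100.0, 100.0)]
centers, labels = trimmed_kmeans(points, [(0.0, 0.0), (5.0, 5.0)], alpha=0.15)
```

The result depends on the initial centers, so a practical version would presumably try several starting configurations and keep the one with the smallest retained within-cluster dispersion.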


Statistics and Computing | 2011

Exploring the number of groups in robust model-based clustering

Luis Angel García-Escudero; Alfonso Gordaliza; Carlos Matrán; Agustín Mayo-Iscar

Two key questions in Clustering problems are how to determine the number of groups properly and how to measure the strength of group assignments. These questions are especially involved when the presence of a certain fraction of outlying data is also expected. Any answer to these two key questions should depend on the assumed probabilistic model, the allowed group scatters and what we understand by noise. With this in mind, some exploratory “trimming-based” tools are presented in this work together with their justifications. The monitoring of optimal values reached when solving a robust clustering criterion and the use of some “discriminant” factors are the basis for these exploratory tools.


Computational Statistics & Data Analysis | 2007

The random projection method in goodness of fit for functional data

Juan A. Cuesta-Albertos; E. del Barrio; Ricardo Fraiman; Carlos Matrán

The possibility of considering random projections to identify probability distributions belonging to parametric families is explored. The results are based on considerations involving invariance properties of the family of distributions as well as on the random way of choosing the projections. In particular, it is shown that if a one-dimensional (suitably) randomly chosen projection is Gaussian, then the distribution is Gaussian. In order to show the applicability of the methodology, some goodness-of-fit tests based on these ideas are designed. These tests are computationally feasible through the bootstrap setup, even in the functional framework. Simulations are presented that provide power comparisons of these projection-based tests with other available tests of normality, and that test the Black-Scholes model for a stochastic process.


Statistics & Probability Letters | 1996

On the unconditional strong law of large numbers for the bootstrap mean

Eusebio Arenal-Gutiérrez; Carlos Matrán; Juan A. Cuesta-Albertos

We first analyze some results by Athreya (1983) and Csorgo (1992). Then, by taking into account the different rates of convergence of the resampling size, we give new, simple proofs of those results. We provide examples that show that the sizes of resampling required by our results to ensure a.s. convergence are not far from being optimal.
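The object studied in this abstract, a bootstrap mean computed with a chosen resampling size, can be sketched in a few lines. The function name and the parameter `m` for the resampling size are illustrative assumptions; the sketch says nothing about which resampling sizes ensure a.s. convergence.

```python
import random

def bootstrap_mean(sample, m, rng):
    """Mean of m draws taken with replacement from the observed sample."""
    return sum(rng.choices(sample, k=m)) / m

rng = random.Random(0)           # fixed seed so the run is repeatable
sample = [1.0, 2.0, 3.0, 4.0]    # made-up observations
value = bootstrap_mean(sample, m=8, rng=rng)
```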


Annales de l'Institut Henri Poincaré, Probabilités et Statistiques | 2011

Uniqueness and approximate computation of optimal incomplete transportation plans

Pedro C. Álvarez-Esteban; E. del Barrio; Juan A. Cuesta-Albertos; Carlos Matrán

For a given trimming level $\alpha \in (0,1)$, an $\alpha$-trimmed version, $P_\alpha$, of a probability $P$ is a new probability obtained by re-weighting the probability of any Borel set, $B$, according to a positive weight function, $f \le \frac{1}{1-\alpha}$, in the way $P_\alpha(B) = \int_B f(x)\,P(dx)$. If $P, Q$ are probability measures on a Euclidean space, we consider the optimization problem of obtaining the best $L_2$-Wasserstein approximation between either a fixed probability and trimmed versions of the other, or trimmed versions of both probabilities. These best trimmed approximations naturally lead to new perspectives in the theory of Mass Transportation, where a part of the mass need not be transported. Since optimal transportation plans are not easily computable, we provide theoretical support for Monte-Carlo approximations through a general consistency result. As a remarkable and unexpected additional result, with important implications for future work, we obtain the uniqueness of the optimal solution. Notice that such a solution involves an optimal map $T$ transporting some trimmed version $P_\alpha$ of $P$ to some other $Q_\alpha$ of $Q$; thus, for any point $x$ in the support of $P$, the weight function associated with $P_\alpha$ allows the consideration of $x$ in the transport to be partially or completely avoided. Our results show that in fact only the non-trimmed points (verifying $f(x) = \frac{1}{1-\alpha}$) are transported, while the partially trimmed points (verifying $0 < f(x) < \frac{1}{1-\alpha}$) must remain untransported by $T$.


Journal of the American Statistical Association | 2008

Trimmed Comparison of Distributions

Pedro C. Álvarez-Esteban; Eustasio del Barrio; Juan A. Cuesta-Albertos; Carlos Matrán

This article introduces an analysis of similarity of distributions based on the L2-Wasserstein distance between trimmed distributions. Our main innovation is the use of the impartial trimming methodology, already considered in robust statistics, which we adapt to this setup. Instead of simply removing data at the tails to provide some robustness to the similarity analysis, we develop a data-driven trimming method aimed at maximizing similarity between distributions. Dissimilarity is then measured in terms of the distance between the optimally trimmed distributions. We provide illustrative examples showing the improvements over previous approaches and give the relevant asymptotic results to justify the use of this methodology in applications.
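As background for the distance used in this abstract, the following minimal sketch computes the untrimmed L2-Wasserstein distance between the empirical distributions of two real-valued samples of equal size, where it reduces to matching sorted values. The function name is illustrative, and the data-driven trimming step that is the article's contribution is not reproduced here.

```python
import math

def wasserstein2_1d(x, y):
    """L2-Wasserstein distance between the empirical laws of two equal-size real samples."""
    if len(x) != len(y):
        raise ValueError("this sketch assumes samples of equal size")
    xs, ys = sorted(x), sorted(y)
    # On the real line the optimal coupling pairs the i-th smallest values.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(xs, ys)) / len(xs))

shifted = wasserstein2_1d([0.0, 1.0, 2.0], [3.0, 1.0, 2.0])
```

In this example the second sample is the first one shifted by 1 (in a different order), so every matched pair differs by exactly 1.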

Collaboration


Dive into Carlos Matrán's collaboration.

Top Co-Authors

E. del Barrio

University of Valladolid

A. Tuerodiaz

University of Cantabria
