Pedro C. Álvarez-Esteban
University of Valladolid
Publications
Featured research published by Pedro C. Álvarez-Esteban.
Annales de l'Institut Henri Poincaré, Probabilités et Statistiques | 2011
Pedro C. Álvarez-Esteban; E. del Barrio; Juan A. Cuesta-Albertos; Carlos Matrán
For a given trimming level α ∈ (0,1), an α-trimmed version, P*, of a probability P is a new probability obtained by re-weighting the probability of any Borel set, B, according to a positive weight function, f ≤ 1/(1−α), in the way P*(B) = ∫_B f(x) P(dx). If P, Q are probability measures on a Euclidean space, we consider the optimization problem of obtaining the best L2-Wasserstein approximation between either a fixed probability and trimmed versions of the other, or trimmed versions of both probabilities. These best trimmed approximations naturally lead to new perspectives in the theory of mass transportation, in which part of the mass need not be transported. Since optimal transportation plans are not easily computable, we provide theoretical support for Monte Carlo approximations through a general consistency result. As a remarkable and unexpected additional result, with important implications for future work, we obtain the uniqueness of the optimal solution. Notice that such a solution involves an optimal map T transporting some trimmed version P* of P to some trimmed version Q* of Q, so for any point x in the support of P the weight function associated with P* allows one to partially or completely exclude x from the transport. Our results show that in fact only the non-trimmed points (those verifying f(x) = 1/(1−α)) are transported, while the partially trimmed points (those verifying 0 < f(x) < 1/(1−α)) must remain untransported by T.
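As a rough illustration of the trimmed-transport idea in the simplest setting, the sketch below computes an α-trimmed L2-Wasserstein distance between two one-dimensional samples of equal size by pairing order statistics and discarding the fraction α of pairs with the largest cost. This is a toy approximation, not the paper's algorithm; the function name, the equal-size restriction and the trimming rule are assumptions made for the example.

```python
import numpy as np

def trimmed_w2(x, y, alpha=0.1):
    """Toy alpha-trimmed L2-Wasserstein distance between two 1-D samples.

    Sorting pairs the empirical quantiles (the optimal 1-D coupling); the
    fraction `alpha` of pairs with the largest squared discrepancy is then
    discarded, mimicking mass that is left untransported.
    """
    x, y = np.sort(np.asarray(x, float)), np.sort(np.asarray(y, float))
    if x.size != y.size:
        raise ValueError("this simplified sketch assumes equal sample sizes")
    cost = np.sort((x - y) ** 2)                       # squared quantile differences
    kept = cost[: int(np.ceil((1 - alpha) * cost.size))]
    return np.sqrt(kept.mean())

# toy usage: two samples that differ mainly through a contaminated upper tail
rng = np.random.default_rng(0)
a = rng.normal(0, 1, 1000)
b = np.concatenate([rng.normal(0, 1, 950), rng.normal(8, 1, 50)])
print(trimmed_w2(a, b, alpha=0.0), trimmed_w2(a, b, alpha=0.05))
```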
Journal of the American Statistical Association | 2008
Pedro C. Álvarez-Esteban; Eustasio del Barrio; Juan A. Cuesta-Albertos; Carlos Matrán
This article introduces an analysis of similarity of distributions based on the L2-Wasserstein distance between trimmed distributions. Our main innovation is the use of the impartial trimming methodology, already considered in robust statistics, which we adapt to this setup. Instead of simply removing data at the tails to provide some robustness to the similarity analysis, we develop a data-driven trimming method aimed at maximizing similarity between distributions. Dissimilarity is then measured in terms of the distance between the optimally trimmed distributions. We provide illustrative examples showing the improvements over previous approaches and give the relevant asymptotic results to justify the use of this methodology in applications.
Bernoulli | 2012
Pedro C. Álvarez-Esteban; Eustasio del Barrio; Juan A. Cuesta-Albertos; Carlos Matrán
We say that two probabilities are similar at level α if they are contaminated versions (up to an α fraction) of the same common probability. We show how this model is related to minimal distances between sets of trimmed probabilities. Empirical versions turn out to present an overfitting effect, in the sense that trimming beyond the similarity level results in trimmed samples that are closer to each other than expected. We show how this can be combined with a bootstrap approach to assess similarity from two data samples.
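The following small numerical illustration of this effect is a sketch on assumed synthetic data, not the paper's experiment: one sample carries a 10% contaminating fraction far from the common core, and a toy trimmed distance (same simplification as above) collapses once the trimming level reaches roughly that contamination fraction.

```python
import numpy as np

def trimmed_w2(x, y, alpha):
    """Toy 1-D alpha-trimmed L2-Wasserstein distance (equal sample sizes)."""
    cost = np.sort((np.sort(x) - np.sort(y)) ** 2)
    kept = cost[: int(np.ceil((1 - alpha) * cost.size))]
    return np.sqrt(kept.mean())

rng = np.random.default_rng(1)
n, eps = 2000, 0.10                                          # contamination fraction
x = np.concatenate([rng.normal(0, 1, int(n * (1 - eps))),    # common core
                    rng.normal(6, 1, int(n * eps))])         # contaminating part
y = rng.normal(0, 1, n)                                      # uncontaminated version

for alpha in (0.0, 0.05, 0.10, 0.20):
    print(f"trimming level {alpha:.2f}: distance {trimmed_w2(x, y, alpha):.3f}")
```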
Journal of Applied Statistics | 2008
Cristina Rueda-Sabater; Pedro C. Álvarez-Esteban
In this paper, we introduce logistic models to analyse fertility curves. The models are formulated as linear models of the log odds of fertility and are defined in terms of parameters interpreted as measures of level, location and shape of the fertility schedule. This parameterization is useful for the evaluation and interpretation of fertility trends and for projections of future period fertility. For a series of years, the proposed models admit a state-space formulation that allows coherent joint estimation of parameters and forecasting. The main features of the models, compared with other alternatives, are their functional simplicity, flexibility and the interpretability of their parameters. These and other features are analysed in this paper using examples and theoretical results. Data from different countries are analysed and, to validate the logistic approach, we compare the goodness of fit of the new model against well-known alternatives; the analysis gives superior results in most developed countries.
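To make the "linear model of the log odds of fertility" idea concrete, here is a minimal sketch that logit-transforms synthetic age-specific fertility rates and fits them with an assumed quadratic age basis. The basis, the synthetic schedule and the recovered "peak age" are illustrative choices, not the parameterization proposed in the paper.

```python
import numpy as np

# Synthetic age-specific fertility rates f(a) in (0, 1), logit-transformed and
# regressed on simple functions of age (a hypothetical quadratic basis).
ages = np.arange(15, 50)
true_rates = 0.18 * np.exp(-0.5 * ((ages - 29.0) / 5.5) ** 2)        # bell-shaped toy schedule
rng = np.random.default_rng(2)
rates = np.clip(true_rates * np.exp(0.05 * rng.normal(size=ages.size)), 1e-4, 1 - 1e-4)

logit = np.log(rates / (1 - rates))                                   # log odds of fertility
X = np.column_stack([np.ones_like(ages, dtype=float), ages, ages ** 2])
beta, *_ = np.linalg.lstsq(X, logit, rcond=None)                      # linear model on the logit scale

fitted = 1 / (1 + np.exp(-(X @ beta)))                                # back-transform to rates
peak_age = -beta[1] / (2 * beta[2])                                   # location of the schedule's mode
print("estimated peak age of fertility:", round(float(peak_age), 1))
```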
Bernoulli | 2018
Pedro C. Álvarez-Esteban; Eustasio del Barrio; Juan A. Cuesta-Albertos; Carlos Matrán
We introduce a general theory for a consensus-based combination of estimations of probability measures. Potential applications include parallelized or distributed sampling schemes as well as variations on aggregation from resampling techniques like boosting or bagging. Taking into account the possibility of very discrepant estimations, instead of a full consensus we consider a "wide consensus" procedure. The approach is based on the consideration of trimmed barycenters in the Wasserstein space of probability measures. We provide general existence and consistency results as well as suitable properties of these robustified Fréchet means. To make the approach readily applicable, we also include characterizations of barycenters of probabilities that belong to (not necessarily elliptical) location and scatter families. For these families we provide an iterative algorithm for the effective computation of trimmed barycenters, based on a consistent algorithm for computing barycenters, guaranteeing applicability in a wide range of statistical problems. AMS Subject Classification: Primary 60B05, 62F35; Secondary 62H12.
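The iterative computation of barycenters mentioned above can be sketched, for the Gaussian case of a location-scatter family, with a fixed-point map on covariance matrices of the kind studied by the same authors in the JMAA 2016 paper listed below. The code is a simplified sketch: the trimming step is omitted, and the function name, starting point and stopping rule are assumptions.

```python
import numpy as np
from scipy.linalg import sqrtm

def gaussian_barycenter_cov(covs, weights, iters=100, tol=1e-10):
    """Fixed-point iteration for the covariance of the Wasserstein barycenter
    of Gaussian distributions (barycenter mean is just the weighted mean of means):

        S <- S^{-1/2} ( sum_j w_j (S^{1/2} Sigma_j S^{1/2})^{1/2} )^2 S^{-1/2}
    """
    d = covs[0].shape[0]
    S = np.eye(d)                                   # any positive-definite start works
    for _ in range(iters):
        root = np.real(sqrtm(S))
        inv_root = np.linalg.inv(root)
        M = sum(w * np.real(sqrtm(root @ C @ root)) for w, C in zip(weights, covs))
        S_new = inv_root @ M @ M @ inv_root
        if np.linalg.norm(S_new - S) < tol:
            return S_new
        S = S_new
    return S

# toy usage: barycenter covariance of three 2-D Gaussians with equal weights
covs = [np.array([[2.0, 0.3], [0.3, 1.0]]),
        np.array([[1.0, -0.2], [-0.2, 1.5]]),
        np.array([[0.8, 0.0], [0.0, 2.5]])]
print(gaussian_barycenter_cov(covs, [1 / 3] * 3))
```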
ASME 2014 33rd International Conference on Ocean, Offshore and Arctic Engineering | 2014
Carolina Euán; Joaquín Ortega; Pedro C. Álvarez-Esteban
The problem of detecting changes in the state of the sea is very important for the analysis and determination of the wave climate at a given location. Wave measurements are frequently analyzed statistically as time series, and segmentation algorithms developed in this context are used to determine change-points. However, most methods found in the literature consider the case of instantaneous changes in the time series, which is not usually the case for sea waves, where changes take a certain time interval to occur. We propose a new segmentation method that allows for the presence of transition intervals between successive stationary periods and is based on the analysis of distances between normalized spectra to detect clusters in the time series. The series is divided into 30-minute intervals and the spectral density is estimated for each one. The normalized spectra are compared using the total variation distance, and a hierarchical clustering method is applied to the distance matrix. The information obtained from the clustering algorithm is used to classify the intervals as belonging to a stationary or a transition period. We present simulation studies to validate the method and examples of applications to real data.
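The pipeline just described (windowing, normalized spectra, total variation distances, hierarchical clustering) can be condensed into the following sketch, which uses synthetic data and generic SciPy tools rather than the authors' implementation; the specific handling of transition intervals is omitted.

```python
import numpy as np
from scipy.signal import periodogram
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

def segment_by_spectra(series, fs, window_len, n_clusters=2):
    """Window the series, estimate and normalize each window's spectrum, compare
    windows with a total-variation distance, and cluster the distance matrix."""
    n_win = len(series) // window_len
    windows = series[: n_win * window_len].reshape(n_win, window_len)
    spectra = []
    for w in windows:
        f, p = periodogram(w, fs=fs)
        spectra.append(p / p.sum())                 # normalized spectrum (discrete "density")
    spectra = np.array(spectra)
    # total variation distance between normalized spectra i and j
    dist = np.array([[0.5 * np.abs(si - sj).sum() for sj in spectra] for si in spectra])
    Z = linkage(squareform(dist, checks=False), method="average")
    return fcluster(Z, t=n_clusters, criterion="maxclust")   # cluster label per window

# toy usage: a low-frequency regime followed by a high-frequency one
rng = np.random.default_rng(3)
t = np.arange(0, 600, 0.5)                          # 2 Hz sampling, 1200 points
x = np.concatenate([np.sin(2 * np.pi * 0.08 * t[:600]), np.sin(2 * np.pi * 0.4 * t[600:])])
x += 0.1 * rng.normal(size=x.size)
print(segment_by_spectra(x, fs=2.0, window_len=120))
```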
Statistical Science | 2017
Pedro C. Álvarez-Esteban; E. del Barrio; Juan A. Cuesta-Albertos; Carlos Matrán
Research partially supported by the Spanish Ministerio de Economía y Competitividad and FEDER funds, grants MTM2014-56235-C2-1-P and MTM2014-56235-C2-2, and by the Consejería de Educación de la Junta de Castilla y León, grant VA212U13.
Computational Statistics & Data Analysis | 2013
Pedro C. Álvarez-Esteban; E. del Barrio; Juan A. Cuesta-Albertos; Carlos Matrán
The grades of a Spanish university access exam involving 10 graders are analyzed. The interest focuses on finding the largest group of graders showing similar grading patterns or, equivalently, on detecting whether there are graders whose grades exhibit significant deviations from the pattern determined by the remaining graders. Due to differences in the background of the students and graders involved, homogeneity is too strong an assumption to be a realistic null model. Instead, the weaker similarity model, which seems more appropriate in this setting, is considered. To handle this problem, a statistical procedure designed to search for a hidden main pattern is developed. The procedure is based on the detection and deletion of the graders that are significantly non-similar to (the pooled mixture of) the others. This is performed through the use of a probability metric, a bootstrap approach and a stepwise search algorithm. Moreover, the procedure also allows one to identify which part of each grader's grades makes her/him different from the others.
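A heavily simplified sketch of the stepwise search idea follows: each remaining grader is compared with the pool of the others, and the most discrepant one is removed while its resampling p-value stays below a threshold. The Kolmogorov-Smirnov statistic and permutation-style calibration below are stand-ins for the trimmed-distance similarity test and bootstrap used in the paper; names, thresholds and data are illustrative.

```python
import numpy as np

def ks_stat(a, b):
    """Two-sample Kolmogorov-Smirnov statistic (stand-in for the trimmed metric)."""
    grid = np.sort(np.concatenate([a, b]))
    Fa = np.searchsorted(np.sort(a), grid, side="right") / a.size
    Fb = np.searchsorted(np.sort(b), grid, side="right") / b.size
    return np.abs(Fa - Fb).max()

def stepwise_outlying_graders(grades, n_boot=500, level=0.05, seed=0):
    """Drop the most discrepant grader while its resampling p-value is below `level`."""
    rng = np.random.default_rng(seed)
    active, removed = list(grades), []
    while len(active) > 1:
        pvals = {}
        for g in active:
            others = np.concatenate([grades[h] for h in active if h != g])
            obs = ks_stat(grades[g], others)
            pooled = np.concatenate([grades[g], others])
            boot = [ks_stat(*np.split(rng.permutation(pooled), [grades[g].size]))
                    for _ in range(n_boot)]
            pvals[g] = np.mean(np.array(boot) >= obs)
        worst = min(pvals, key=pvals.get)
        if pvals[worst] >= level:
            break
        active.remove(worst)
        removed.append(worst)
    return removed, active

# toy usage: grader "C" grades systematically higher than the rest
rng = np.random.default_rng(4)
grades = {g: np.clip(rng.normal(6.0, 1.5, 200), 0, 10) for g in "ABDE"}
grades["C"] = np.clip(rng.normal(7.5, 1.5, 200), 0, 10)
print(stepwise_outlying_graders(grades))
```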
IEEE/AIAA Digital Avionics Systems Conference | 2017
Miguel A. Martínez-Prieto; Anibal Bregon; Ivan Garcia-Miranda; Pedro C. Álvarez-Esteban; Fernando Díaz; David Scarlatti
Flight cancellations, departure delays, congestion in taxi times and airborne holding delays are increasingly frequent problems that negatively impact performance, fuel burn, emission rates and customer satisfaction at major airports around the world. However, this is just a glimpse of the future to come. The dramatic growth in air traffic levels has become a problem of paramount importance, leading to increased interest in enhancing current Air Traffic Management (ATM) systems. The main objective is to be able to cope with sustained air traffic growth under safe, economic, efficient and environmentally friendly working conditions. The ADS-B (Automatic Dependent Surveillance - Broadcast) technology plays a major role in the new ATM systems, since it provides more accurate real-time positioning information than secondary radars while relying on cheaper infrastructure. However, the main drawback of ADS-B technology is the generation of large volumes of data that, when merged with other flight-related information, raise important scalability issues. In this work, we start from a previously developed data lake that supports the full ADS-B data life-cycle in a scalable and cost-effective way, and propose a data architecture to integrate data from different providers and reconstruct flight trajectories that can ultimately be used to improve the efficiency of flight operations. This data architecture is evaluated on a 2-week testbed, which reports some interesting figures about its effectiveness.
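As a toy illustration of the trajectory-reconstruction step only (not the proposed data architecture, which involves a data lake and multiple providers), the sketch below groups ADS-B position reports by aircraft address and orders them in time; the message fields and values are assumed for the example.

```python
from dataclasses import dataclass
from collections import defaultdict

@dataclass
class AdsbMessage:
    icao24: str        # aircraft transponder address
    timestamp: float   # seconds since some reference epoch
    lat: float
    lon: float
    altitude_ft: float

def reconstruct_trajectories(messages):
    """Group position reports by aircraft and order them in time, giving one
    trajectory (list of points) per aircraft."""
    by_aircraft = defaultdict(list)
    for m in messages:
        by_aircraft[m.icao24].append(m)
    return {icao: sorted(ms, key=lambda m: m.timestamp)
            for icao, ms in by_aircraft.items()}

# toy usage with three out-of-order reports from two aircraft
msgs = [AdsbMessage("4ca7b5", 120.0, 41.65, -4.72, 32000),
        AdsbMessage("34c2d1", 118.0, 40.47, -3.56, 15000),
        AdsbMessage("4ca7b5", 60.0, 41.30, -4.10, 30000)]
for icao, traj in reconstruct_trajectories(msgs).items():
    print(icao, [(m.timestamp, m.lat, m.lon) for m in traj])
```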
Journal of Mathematical Analysis and Applications | 2016
Pedro C. Álvarez-Esteban; E. del Barrio; Juan A. Cuesta-Albertos; Carlos Matrán