Pedro C. Álvarez-Esteban
University of Valladolid
Publications
Featured research published by Pedro C. Álvarez-Esteban.
Annales de l'Institut Henri Poincaré, Probabilités et Statistiques | 2011
Pedro C. Álvarez-Esteban; E. del Barrio; Juan A. Cuesta-Albertos; Carlos Matrán
For a given trimming level α ∈ (0,1), an α-trimmed version, P*, of a probability P is a new probability obtained by re-weighting the probability of any Borel set, B, according to a positive weight function, f ≤ 1/(1−α), in the way P*(B) = ∫_B f(x) P(dx). If P, Q are probability measures on a Euclidean space, we consider the optimization problem of obtaining the best L2-Wasserstein approximation between either a fixed probability and trimmed versions of the other, or trimmed versions of both probabilities. These best trimmed approximations naturally lead to new perspectives in the theory of mass transportation, in which part of the mass need not be transported. Since optimal transportation plans are not easily computable, we provide theoretical support for Monte Carlo approximations through a general consistency result. As a remarkable and unexpected additional result, with important implications for future work, we obtain the uniqueness of the optimal solution. Notice that such a solution involves an optimal map T transporting some trimmed version P* of P to some trimmed version Q* of Q, so for any point x in the support of P the weight function associated with P* allows one to partially or completely exclude x from the transport. Our results show that in fact only the non-trimmed points (those verifying f(x) = 1/(1−α)) are transported, while the partially trimmed points (those verifying 0 < f(x) < 1/(1−α)) must remain untransported by T.
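As a rough illustration of the trimmed-transport idea in the simplest setting, the sketch below computes an α-trimmed L2-Wasserstein distance between two one-dimensional samples of equal size by pairing order statistics and discarding the fraction α of pairs with the largest cost. This is a toy approximation, not the paper's algorithm; the function name, the equal-size restriction and the trimming rule are assumptions made for the example.

```python
import numpy as np

def trimmed_w2(x, y, alpha=0.1):
    """Toy alpha-trimmed L2-Wasserstein distance between two 1-D samples.

    Sorting pairs the empirical quantiles (the optimal 1-D coupling); the
    fraction `alpha` of pairs with the largest squared discrepancy is then
    discarded, mimicking mass that is left untransported.
    """
    x, y = np.sort(np.asarray(x, float)), np.sort(np.asarray(y, float))
    if x.size != y.size:
        raise ValueError("this simplified sketch assumes equal sample sizes")
    cost = np.sort((x - y) ** 2)                       # squared quantile differences
    kept = cost[: int(np.ceil((1 - alpha) * cost.size))]
    return np.sqrt(kept.mean())

# toy usage: two samples that differ mainly through a contaminated upper tail
rng = np.random.default_rng(0)
a = rng.normal(0, 1, 1000)
b = np.concatenate([rng.normal(0, 1, 950), rng.normal(8, 1, 50)])
print(trimmed_w2(a, b, alpha=0.0), trimmed_w2(a, b, alpha=0.05))
```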
Journal of the American Statistical Association | 2008
Pedro C. Álvarez-Esteban; Eustasio del Barrio; Juan A. Cuesta-Albertos; Carlos Matrán
This article introduces an analysis of similarity of distributions based on the L2-Wasserstein distance between trimmed distributions. Our main innovation is the use of the impartial trimming methodology, already considered in robust statistics, which we adapt to this setup. Instead of simply removing data at the tails to provide some robustness to the similarity analysis, we develop a data-driven trimming method aimed at maximizing similarity between distributions. Dissimilarity is then measured in terms of the distance between the optimally trimmed distributions. We provide illustrative examples showing the improvements over previous approaches and give the relevant asymptotic results to justify the use of this methodology in applications.
Bernoulli | 2012
Pedro C. Álvarez-Esteban; Eustasio del Barrio; Juan A. Cuesta-Albertos; Carlos Matrán
We say that two probabilities are similar at level α if they are contaminated versions (up to an α fraction) of the same common probability. We show how this model is related to minimal distances between sets of trimmed probabilities. Empirical versions turn out to present an overfitting effect, in the sense that trimming beyond the similarity level results in trimmed samples that are closer to each other than expected. We show how this can be combined with a bootstrap approach to assess similarity from two data samples.
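The following small numerical illustration of this effect is a sketch on assumed synthetic data, not the paper's experiment: one sample carries a 10% contaminating fraction far from the common core, and a toy trimmed distance (same simplification as above) collapses once the trimming level reaches roughly that contamination fraction.

```python
import numpy as np

def trimmed_w2(x, y, alpha):
    """Toy 1-D alpha-trimmed L2-Wasserstein distance (equal sample sizes)."""
    cost = np.sort((np.sort(x) - np.sort(y)) ** 2)
    kept = cost[: int(np.ceil((1 - alpha) * cost.size))]
    return np.sqrt(kept.mean())

rng = np.random.default_rng(1)
n, eps = 2000, 0.10                                          # contamination fraction
x = np.concatenate([rng.normal(0, 1, int(n * (1 - eps))),    # common core
                    rng.normal(6, 1, int(n * eps))])         # contaminating part
y = rng.normal(0, 1, n)                                      # uncontaminated version

for alpha in (0.0, 0.05, 0.10, 0.20):
    print(f"trimming level {alpha:.2f}: distance {trimmed_w2(x, y, alpha):.3f}")
```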
Journal of Applied Statistics | 2008
Cristina Rueda-Sabater; Pedro C. Álvarez-Esteban
In this paper, we introduce logistic models to analyse fertility curves. The models are formulated as linear models of the log odds of fertility and are defined in terms of parameters interpreted as measures of level, location and shape of the fertility schedule. This parameterization is useful for the evaluation and interpretation of fertility trends and for projections of future period fertility. For a series of years, the proposed models admit a state-space formulation that allows coherent joint estimation of parameters and forecasting. The main features of the models, compared with other alternatives, are their functional simplicity, flexibility and the interpretability of their parameters. These and other features are analysed in this paper using examples and theoretical results. Data from different countries are analysed and, to validate the logistic approach, we compare the goodness of fit of the new model against well-known alternatives; the analysis gives superior results in most developed countries.
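To make the "linear model of the log odds of fertility" idea concrete, here is a minimal sketch that logit-transforms synthetic age-specific fertility rates and fits them with an assumed quadratic age basis. The basis, the synthetic schedule and the recovered "peak age" are illustrative choices, not the parameterization proposed in the paper.

```python
import numpy as np

# Synthetic age-specific fertility rates f(a) in (0, 1), logit-transformed and
# regressed on simple functions of age (a hypothetical quadratic basis).
ages = np.arange(15, 50)
true_rates = 0.18 * np.exp(-0.5 * ((ages - 29.0) / 5.5) ** 2)        # bell-shaped toy schedule
rng = np.random.default_rng(2)
rates = np.clip(true_rates * np.exp(0.05 * rng.normal(size=ages.size)), 1e-4, 1 - 1e-4)

logit = np.log(rates / (1 - rates))                                   # log odds of fertility
X = np.column_stack([np.ones_like(ages, dtype=float), ages, ages ** 2])
beta, *_ = np.linalg.lstsq(X, logit, rcond=None)                      # linear model on the logit scale

fitted = 1 / (1 + np.exp(-(X @ beta)))                                # back-transform to rates
peak_age = -beta[1] / (2 * beta[2])                                   # location of the schedule's mode
print("estimated peak age of fertility:", round(float(peak_age), 1))
```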
Bernoulli | 2018
Pedro C. Álvarez-Esteban; Eustasio del Barrio; Juan A. Cuesta-Albertos; Carlos Matrán
We introduce a general theory for a consensus-based combination of estimations of probability measures. Potential applications include parallelized or distributed sampling schemes as well as variations on aggregation from resampling techniques like boosting or bagging. Taking into account the possibility of very discrepant estimations, instead of a full consensus we consider a "wide consensus" procedure. The approach is based on the consideration of trimmed barycenters in the Wasserstein space of probability measures. We provide general existence and consistency results as well as suitable properties of these robustified Fréchet means. To make the approach readily applicable, we also include characterizations of barycenters of probabilities that belong to (not necessarily elliptical) location and scatter families. For these families we provide an iterative algorithm for the effective computation of trimmed barycenters, based on a consistent algorithm for computing barycenters, guaranteeing applicability in a wide range of statistical problems. AMS Subject Classification: Primary 60B05, 62F35; Secondary 62H12.
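The iterative computation of barycenters mentioned above can be sketched, for the Gaussian case of a location-scatter family, with a fixed-point map on covariance matrices of the kind studied by the same authors in the JMAA 2016 paper listed below. The code is a simplified sketch: the trimming step is omitted, and the function name, starting point and stopping rule are assumptions.

```python
import numpy as np
from scipy.linalg import sqrtm

def gaussian_barycenter_cov(covs, weights, iters=100, tol=1e-10):
    """Fixed-point iteration for the covariance of the Wasserstein barycenter
    of Gaussian distributions (barycenter mean is just the weighted mean of means):

        S <- S^{-1/2} ( sum_j w_j (S^{1/2} Sigma_j S^{1/2})^{1/2} )^2 S^{-1/2}
    """
    d = covs[0].shape[0]
    S = np.eye(d)                                   # any positive-definite start works
    for _ in range(iters):
        root = np.real(sqrtm(S))
        inv_root = np.linalg.inv(root)
        M = sum(w * np.real(sqrtm(root @ C @ root)) for w, C in zip(weights, covs))
        S_new = inv_root @ M @ M @ inv_root
        if np.linalg.norm(S_new - S) < tol:
            return S_new
        S = S_new
    return S

# toy usage: barycenter covariance of three 2-D Gaussians with equal weights
covs = [np.array([[2.0, 0.3], [0.3, 1.0]]),
        np.array([[1.0, -0.2], [-0.2, 1.5]]),
        np.array([[0.8, 0.0], [0.0, 2.5]])]
print(gaussian_barycenter_cov(covs, [1 / 3] * 3))
```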
ASME 2014 33rd International Conference on Ocean, Offshore and Arctic Engineering | 2014
Carolina Euán; Joaquín Ortega; Pedro C. Álvarez-Esteban
The problem of detecting changes in the state of the sea is very important for the analysis and determination of the wave climate at a given location. Wave measurements are frequently analyzed statistically as time series, and segmentation algorithms developed in this context are used to determine change-points. However, most methods found in the literature consider the case of instantaneous changes in the time series, which is not usually the case for sea waves, where changes take a certain time interval to occur. We propose a new segmentation method that allows for the presence of transition intervals between successive stationary periods and is based on the analysis of distances between normalized spectra to detect clusters in the time series. The series is divided into 30-minute intervals and the spectral density is estimated for each one. The normalized spectra are compared using the total variation distance, and a hierarchical clustering method is applied to the distance matrix. The information obtained from the clustering algorithm is used to classify the intervals as belonging to a stationary or a transition period. We present simulation studies to validate the method and examples of applications to real data.
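The pipeline just described (windowing, normalized spectra, total variation distances, hierarchical clustering) can be condensed into the following sketch, which uses synthetic data and generic SciPy tools rather than the authors' implementation; the specific handling of transition intervals is omitted.

```python
import numpy as np
from scipy.signal import periodogram
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

def segment_by_spectra(series, fs, window_len, n_clusters=2):
    """Window the series, estimate and normalize each window's spectrum, compare
    windows with a total-variation distance, and cluster the distance matrix."""
    n_win = len(series) // window_len
    windows = series[: n_win * window_len].reshape(n_win, window_len)
    spectra = []
    for w in windows:
        f, p = periodogram(w, fs=fs)
        spectra.append(p / p.sum())                 # normalized spectrum (discrete "density")
    spectra = np.array(spectra)
    # total variation distance between normalized spectra i and j
    dist = np.array([[0.5 * np.abs(si - sj).sum() for sj in spectra] for si in spectra])
    Z = linkage(squareform(dist, checks=False), method="average")
    return fcluster(Z, t=n_clusters, criterion="maxclust")   # cluster label per window

# toy usage: a low-frequency regime followed by a high-frequency one
rng = np.random.default_rng(3)
t = np.arange(0, 600, 0.5)                          # 2 Hz sampling, 1200 points
x = np.concatenate([np.sin(2 * np.pi * 0.08 * t[:600]), np.sin(2 * np.pi * 0.4 * t[600:])])
x += 0.1 * rng.normal(size=x.size)
print(segment_by_spectra(x, fs=2.0, window_len=120))
```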
Statistical Science | 2017
Pedro C. Álvarez-Esteban; E. del Barrio; Juan A. Cuesta-Albertos; Carlos Matrán
Research partially supported by the Spanish Ministerio de Economía y Competitividad and FEDER funds, grants MTM2014-56235-C2-1-P and MTM2014-56235-C2-2, and by the Consejería de Educación de la Junta de Castilla y León, grant VA212U13.
Computational Statistics & Data Analysis | 2013
Pedro C. Álvarez-Esteban; E. del Barrio; Juan A. Cuesta-Albertos; Carlos Matrán
The grades of a Spanish university access exam involving 10 graders are analyzed. The interest focuses on finding the largest group of graders showing similar grading patterns or, equivalently, on detecting whether there are graders whose grades exhibit significant deviations from the pattern determined by the remaining graders. Due to differences in the background of the students and graders involved, homogeneity is too strong an assumption to be a realistic null model. Instead, the weaker similarity model, which seems more appropriate in this setting, is considered. To handle this problem, a statistical procedure designed to search for a hidden main pattern is developed. The procedure is based on the detection and deletion of the graders that are significantly non-similar to (the pooled mixture of) the others. This is performed through the use of a probability metric, a bootstrap approach and a stepwise search algorithm. Moreover, the procedure also allows one to identify which part of each grader's grades makes her/him different from the others.
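A heavily simplified sketch of the stepwise search idea follows: each remaining grader is compared with the pool of the others, and the most discrepant one is removed while its resampling p-value stays below a threshold. The Kolmogorov-Smirnov statistic and permutation-style calibration below are stand-ins for the trimmed-distance similarity test and bootstrap used in the paper; names, thresholds and data are illustrative.

```python
import numpy as np

def ks_stat(a, b):
    """Two-sample Kolmogorov-Smirnov statistic (stand-in for the trimmed metric)."""
    grid = np.sort(np.concatenate([a, b]))
    Fa = np.searchsorted(np.sort(a), grid, side="right") / a.size
    Fb = np.searchsorted(np.sort(b), grid, side="right") / b.size
    return np.abs(Fa - Fb).max()

def stepwise_outlying_graders(grades, n_boot=500, level=0.05, seed=0):
    """Drop the most discrepant grader while its resampling p-value is below `level`."""
    rng = np.random.default_rng(seed)
    active, removed = list(grades), []
    while len(active) > 1:
        pvals = {}
        for g in active:
            others = np.concatenate([grades[h] for h in active if h != g])
            obs = ks_stat(grades[g], others)
            pooled = np.concatenate([grades[g], others])
            boot = [ks_stat(*np.split(rng.permutation(pooled), [grades[g].size]))
                    for _ in range(n_boot)]
            pvals[g] = np.mean(np.array(boot) >= obs)
        worst = min(pvals, key=pvals.get)
        if pvals[worst] >= level:
            break
        active.remove(worst)
        removed.append(worst)
    return removed, active

# toy usage: grader "C" grades systematically higher than the rest
rng = np.random.default_rng(4)
grades = {g: np.clip(rng.normal(6.0, 1.5, 200), 0, 10) for g in "ABDE"}
grades["C"] = np.clip(rng.normal(7.5, 1.5, 200), 0, 10)
print(stepwise_outlying_graders(grades))
```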
IEEE/AIAA Digital Avionics Systems Conference | 2017
Miguel A. Martínez-Prieto; Anibal Bregon; Ivan Garcia-Miranda; Pedro C. Álvarez-Esteban; Fernando Díaz; David Scarlatti
Flight cancellations, departure delays, congestion in taxi times and airborne holding delays are increasingly frequent problems that negatively impact performance, fuel burn, emission rates and customer satisfaction at major airports around the world. However, this is just a glimpse of the future to come. The dramatic growth in air traffic levels has become a problem of paramount importance, leading to increased interest in enhancing current Air Traffic Management (ATM) systems. The main objective is to be able to cope with sustained air traffic growth under safe, economic, efficient and environmentally friendly working conditions. The ADS-B (Automatic Dependent Surveillance - Broadcast) technology plays a major role in the new ATM systems, since it provides more accurate real-time positioning information than secondary radars while relying on cheaper infrastructure. However, the main drawback of ADS-B technology is the generation of large volumes of data that, when merged with other flight-related information, raise important scalability issues. In this work, we start from a previously developed data lake that supports the full ADS-B data life-cycle in a scalable and cost-effective way, and propose a data architecture to integrate data from different providers and reconstruct flight trajectories that can ultimately be used to improve the efficiency of flight operations. This data architecture is evaluated on a 2-week testbed, which reports some interesting figures about its effectiveness.
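As a toy illustration of the trajectory-reconstruction step only (not the proposed data architecture, which involves a data lake and multiple providers), the sketch below groups ADS-B position reports by aircraft address and orders them in time; the message fields and values are assumed for the example.

```python
from dataclasses import dataclass
from collections import defaultdict

@dataclass
class AdsbMessage:
    icao24: str        # aircraft transponder address
    timestamp: float   # seconds since some reference epoch
    lat: float
    lon: float
    altitude_ft: float

def reconstruct_trajectories(messages):
    """Group position reports by aircraft and order them in time, giving one
    trajectory (list of points) per aircraft."""
    by_aircraft = defaultdict(list)
    for m in messages:
        by_aircraft[m.icao24].append(m)
    return {icao: sorted(ms, key=lambda m: m.timestamp)
            for icao, ms in by_aircraft.items()}

# toy usage with three out-of-order reports from two aircraft
msgs = [AdsbMessage("4ca7b5", 120.0, 41.65, -4.72, 32000),
        AdsbMessage("34c2d1", 118.0, 40.47, -3.56, 15000),
        AdsbMessage("4ca7b5", 60.0, 41.30, -4.10, 30000)]
for icao, traj in reconstruct_trajectories(msgs).items():
    print(icao, [(m.timestamp, m.lat, m.lon) for m in traj])
```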
Journal of Mathematical Analysis and Applications | 2016
Pedro C. Álvarez-Esteban; E. del Barrio; Juan A. Cuesta-Albertos; Carlos Matrán