Daniel Peña
Charles III University of Madrid
Publication
Featured research published by Daniel Peña.
Statistics & Probability Letters | 1997
Ana Justel; Daniel Peña; Ruben H. Zamar
This paper presents a distribution-free multivariate Kolmogorov-Smirnov goodness-of-fit test. The test statistic is built using Rosenblatt's transformation, and an algorithm is developed to compute it in the bivariate case. An approximate test, which can be easily computed in any dimension, is also presented. The power of these multivariate tests is investigated in a simulation study.
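A minimal sketch of the approximate version of such a test, assuming a fully specified bivariate normal null hypothesis: Rosenblatt's transformation maps the sample to (approximately) independent uniforms, and the KS-type discrepancy is evaluated only at the transformed sample points. The function names and the normal null below are illustrative choices, not the paper's exact algorithm.

```python
import numpy as np
from scipy.stats import norm

def rosenblatt_bivariate_normal(x, mean, cov):
    """Map (x1, x2) to (u1, u2) that are iid Uniform(0, 1) under the bivariate normal H0."""
    s1, s2 = np.sqrt(cov[0, 0]), np.sqrt(cov[1, 1])
    rho = cov[0, 1] / (s1 * s2)
    u1 = norm.cdf((x[:, 0] - mean[0]) / s1)
    cond_mean = mean[1] + rho * s2 / s1 * (x[:, 0] - mean[0])   # E[X2 | X1]
    cond_sd = s2 * np.sqrt(1.0 - rho ** 2)
    u2 = norm.cdf((x[:, 1] - cond_mean) / cond_sd)
    return np.column_stack([u1, u2])

def approx_ks_statistic(u):
    """Sup of |empirical CDF - uniform CDF| evaluated at the transformed sample points only."""
    ecdf = np.array([np.mean((u[:, 0] <= a) & (u[:, 1] <= b)) for a, b in u])
    return np.max(np.abs(ecdf - u[:, 0] * u[:, 1]))

rng = np.random.default_rng(0)
mean, cov = np.array([0.0, 0.0]), np.array([[1.0, 0.5], [0.5, 1.0]])
sample = rng.multivariate_normal(mean, cov, size=200)
u = rosenblatt_bivariate_normal(sample, mean, cov)
print(approx_ks_statistic(u))   # small values support H0; calibrate critical values by simulation
```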
Test | 1999
N. Locantore; J. S. Marron; Douglas G. Simpson; N. Tripoli; Jin-Ting Zhang; K. L. Cohen; Graciela Boente; Ricardo Fraiman; Babette A. Brumback; Christophe Croux; Jianqing Fan; Alois Kneip; John I. Marden; Daniel Peña; Javier Prieto; James O. Ramsay; Mariano J. Valderrama; Ana M. Aguilera
A method for exploring the structure of populations of complex objects, such as images, is considered. The objects are summarized by feature vectors. The statistical backbone is Principal Component Analysis in the space of feature vectors. Visual insights come from representing the results in the original data space. In an ophthalmological example, endemic outliers motivate the development of a bounded influence approach to PCA.
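The sketch below illustrates one common bounded-influence variant of PCA of this kind: feature vectors are centred at the spatial median and projected onto the unit sphere before the components are extracted, so any single outlier has bounded pull on the estimated directions. The helper names and the Weiszfeld iteration are my own choices, not necessarily the paper's exact construction.

```python
import numpy as np

def spatial_median(X, n_iter=200, tol=1e-8):
    """Weiszfeld iterations for the spatial (multivariate L1) median."""
    m = X.mean(axis=0)
    for _ in range(n_iter):
        d = np.maximum(np.linalg.norm(X - m, axis=1), tol)
        m_new = (X / d[:, None]).sum(axis=0) / (1.0 / d).sum()
        if np.linalg.norm(m_new - m) < tol:
            break
        m = m_new
    return m

def spherical_pca(X, n_components=2):
    """PCA of the data projected onto the unit sphere around a robust centre."""
    center = spatial_median(X)
    Z = X - center
    Z = Z / np.linalg.norm(Z, axis=1, keepdims=True)   # outliers keep their direction, lose their size
    _, _, Vt = np.linalg.svd(Z - Z.mean(axis=0), full_matrices=False)
    return center, Vt[:n_components]

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 5))
X[:3] += 50                                            # a few gross outliers
center, components = spherical_pca(X)
print(np.round(components, 2))                         # each outlier's influence is bounded by the sphering
```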
Technometrics | 2001
Daniel Peña; Francisco J. Prieto
In this article, we present a simple multivariate outlier-detection procedure and a robust estimator for the covariance matrix, based on the use of information obtained from projections onto the directions that maximize and minimize the kurtosis coefficient of the projected data. The properties of this estimator (computational cost, bias) are analyzed and compared with those of other robust estimators described in the literature through simulation studies. The performance of the outlier-detection procedure is analyzed by applying it to a set of well-known examples.
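A hedged sketch of the projection idea: search for unit directions that maximize and minimize the kurtosis of the projected data, and flag observations whose projections lie far from the median along those directions. A generic numerical optimizer stands in for the paper's specific algorithm, and the cutoff of 4 robust standard deviations is an illustrative choice.

```python
import numpy as np
from scipy.optimize import minimize

def projected_kurtosis(d, Z):
    d = d / np.linalg.norm(d)
    y = Z @ d
    y = (y - y.mean()) / y.std()
    return np.mean(y ** 4)

def kurtosis_direction(Z, sign=+1, n_starts=10, seed=0):
    """Find a unit direction that maximises (sign=+1) or minimises (sign=-1) projected kurtosis."""
    rng = np.random.default_rng(seed)
    best = None
    for _ in range(n_starts):
        d0 = rng.normal(size=Z.shape[1])
        res = minimize(lambda d: -sign * projected_kurtosis(d, Z), d0, method="Nelder-Mead")
        if best is None or res.fun < best.fun:
            best = res
    return best.x / np.linalg.norm(best.x)

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 4))
X[:5] += 8                                       # planted outliers
Z = X - np.median(X, axis=0)                     # crude robust centring, enough for the sketch
for sign in (+1, -1):
    d = kurtosis_direction(Z, sign)
    y = Z @ d
    mad = 1.4826 * np.median(np.abs(y - np.median(y)))
    print("suspects:", np.where(np.abs(y - np.median(y)) / mad > 4)[0])
```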
Journal of the American Statistical Association | 1987
Daniel Peña; George E. P. Box
This article studies how to identify hidden factors in a multivariate time series process. This problem is important because, when the series are driven by a set of common factors, (a) a large number of parameters may be needed to obtain an adequate representation of the system and (b) the estimated parameters will be highly correlated. A complex and badly defined relationship can therefore appear when, in fact, a simpler and more parsimonious model in terms of a few common factors is operating. This article develops a methodology to identify the number of factors and to build a simplifying transformation to represent the series. It is proved that the number of factors is equal to the rank of the covariance matrices and the parameter matrices of the infinite moving average representation of the process. The eigenvectors of these matrices provide the canonical transformation. The method is illustrated with an example using series of the price of wheat in five provinces of Spain in the 19th century.
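The following sketch, under simplifying assumptions of my own (one common stationary AR(1) factor, white idiosyncratic noise), shows the rank analysis behind this idea: lagged covariance matrices of the vector series have reduced rank equal to the number of common factors, and the associated eigen/singular vectors supply the simplifying transformation.

```python
import numpy as np

def lag_cov(X, k):
    """Sample covariance matrix between X_t and X_{t-k} for a (T x m) series."""
    Xc = X - X.mean(axis=0)
    T = len(X)
    return Xc[k:].T @ Xc[:T - k] / T

rng = np.random.default_rng(3)
T, m = 500, 5
factor = np.zeros(T)                              # one common AR(1) factor
for t in range(1, T):
    factor[t] = 0.9 * factor[t - 1] + rng.normal()
loadings = rng.normal(size=m)
X = np.outer(factor, loadings) + 0.1 * rng.normal(size=(T, m))

for k in range(1, 4):
    sv = np.linalg.svd(lag_cov(X, k), compute_uv=False)
    print(f"lag {k}: singular values ≈", np.round(sv, 2))
# A single clearly non-zero singular value at every lag k >= 1 is consistent with one
# common factor; the associated singular vectors give the simplifying transformation.
```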
Computational Statistics & Data Analysis | 2006
Jorge Caiado; Nuno Crato; Daniel Peña
The statistical discrimination and clustering literature has studied the problem of identifying similarities in time series data. Some studies use non-parametric approaches that split a set of time series into clusters by looking at their Euclidean distances in the space of points. A new measure of distance between time series based on the normalized periodogram is proposed. Simulation results comparing this measure with other parametric and non-parametric metrics are provided. In particular, the classification of time series as stationary or non-stationary is discussed. The use of both hierarchical and non-hierarchical clustering algorithms is considered. An illustrative example with economic time series data is also presented.
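A sketch of a periodogram-based dissimilarity of this kind: each series is summarised by its periodogram normalised by the sample variance, and the series are clustered on Euclidean distances between these summaries. The scaling details and clustering settings below are my own choices rather than the paper's.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

def normalized_periodogram(x):
    """Periodogram at the Fourier frequencies, normalised by the sample variance."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    p = np.abs(np.fft.rfft(x)) ** 2 / len(x)
    return p[1:] / x.var()

rng = np.random.default_rng(4)
T = 256
white = [rng.normal(size=T) for _ in range(5)]       # five white-noise series
ar = []
for _ in range(5):                                   # five strongly autocorrelated AR(1) series
    x = np.zeros(T)
    for t in range(1, T):
        x[t] = 0.9 * x[t - 1] + rng.normal()
    ar.append(x)

features = np.array([normalized_periodogram(x) for x in white + ar])
dist = pdist(features)                               # Euclidean distance between periodograms
labels = fcluster(linkage(dist, method="average"), t=2, criterion="maxclust")
print(labels)                                        # the two groups should separate cleanly
```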
Journal of the American Statistical Association | 2002
Daniel Peña; Julio Rodríguez
A new portmanteau test for time series, more powerful than the tests of Ljung and Box and Monti, is proposed. The test is based on the mth root of the determinant of the mth autocorrelation matrix. It is shown that the proposed statistic is a function of all of the squared multiple correlation coefficients of the regressions of the residuals on their lags when the number of lags goes from 1 to m. It can also be written as a function of the first m partial autocorrelation coefficients. The asymptotic distribution of the test statistic is a linear combination of chi-squared distributions and can be approximated by a gamma distribution. It is shown, depending on the model and sample size, that this test can be up to 50% more powerful than the Ljung and Box and Monti tests. The test is applied to the detection of several types of nonlinearity by using the autocorrelation matrix of the squared residuals, and it is shown that, in general, the new test is more powerful than the test of McLeod and Li. An example is presented in which this test finds nonlinearity in the residuals of the sunspot series.
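A hedged sketch of the statistic as described: build the (m+1)×(m+1) Toeplitz matrix of residual autocorrelations and compute n[1 − |R_m|^(1/m)]. The paper's small-sample corrections and the gamma approximation to the null distribution are omitted; the function name below is my own.

```python
import numpy as np
from scipy.linalg import toeplitz

def sample_acf(e, m):
    """Autocorrelations r_0, ..., r_m of a residual series."""
    e = e - e.mean()
    denom = np.sum(e ** 2)
    return np.array([np.sum(e[k:] * e[:len(e) - k]) / denom for k in range(m + 1)])

def det_portmanteau(residuals, m):
    """n * (1 - |R_m|^(1/m)), with R_m the (m+1)x(m+1) residual autocorrelation matrix."""
    r = sample_acf(residuals, m)
    R = toeplitz(r)
    n = len(residuals)
    return n * (1.0 - np.linalg.det(R) ** (1.0 / m))

rng = np.random.default_rng(5)
print(det_portmanteau(rng.normal(size=500), m=10))   # small for white-noise residuals
# Applying the same statistic to squared residuals screens for ARCH-type nonlinearity.
```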
Journal of Business & Economic Statistics | 1990
Daniel Peña
This article studies how to identify influential observations in univariate autoregressive integrated moving average time series models and how to measure their effects on the estimated parameters of the model. The sensitivity of the parameters to the presence of either additive or innovational outliers is analyzed, and influence statistics based on the Mahalanobis distance are presented. The statistic linked to additive outliers is shown to be very useful for indicating the robustness of the fitted model to the given data set. Its application is illustrated using a relevant set of historical data.
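The toy sketch below conveys the general idea of a Mahalanobis-type influence measure, not the paper's exact statistic: a simple AR(1) model is refitted with each observation replaced by a crude interpolation, and the resulting parameter change is scaled by the estimator's covariance, so the largest values point to the most influential observations.

```python
import numpy as np

def ar1_ols(x):
    """OLS fit of x_t = c + phi * x_{t-1} + e_t; returns the estimates and their covariance."""
    X = np.column_stack([np.ones(len(x) - 1), x[:-1]])
    y = x[1:]
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    sigma2 = np.sum((y - X @ beta) ** 2) / (len(y) - 2)
    return beta, sigma2 * np.linalg.inv(X.T @ X)

rng = np.random.default_rng(6)
T = 300
x = np.zeros(T)
for t in range(1, T):
    x[t] = 0.7 * x[t - 1] + rng.normal()
x[150] += 10                                        # an additive outlier

beta_full, cov_full = ar1_ols(x)
influence = np.zeros(T)
for t in range(1, T - 1):
    x_mod = x.copy()
    x_mod[t] = 0.5 * (x[t - 1] + x[t + 1])          # crude replacement of the t-th observation
    d = ar1_ols(x_mod)[0] - beta_full
    influence[t] = d @ np.linalg.solve(cov_full, d)  # Mahalanobis-type distance between fits
print("most influential observation:", influence.argmax())   # should land at or near t = 150
```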
Journal of Statistical Planning and Inference | 2002
Andrés M. Alonso; Daniel Peña; Juan Romo
In this paper we consider bootstrap methods for constructing nonparametric prediction intervals for a general class of linear processes. Our approach uses the sieve bootstrap procedure of Bühlmann (1997), based on residual resampling from an autoregressive approximation to the given process. We show that the sieve bootstrap provides consistent estimators of the conditional distribution of future values given the observed data, assuming that the order of the autoregressive approximation increases with the sample size at a suitable rate and under some restrictions on the polynomial decay of the coefficients ψ_j of the MA(∞) representation of the process. We present a Monte Carlo study comparing the finite sample properties of the sieve bootstrap with those of alternative methods. Finally, we illustrate the performance of the proposed method with real data examples.
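A minimal sketch of a sieve-bootstrap prediction interval under simplifying choices of my own: an AR(p) approximation fitted by least squares with the order growing slowly with the sample size, residual resampling, and percentile intervals for the h-step-ahead value. A fuller implementation would also refit the autoregression on bootstrap replicates of the whole series to account for estimation uncertainty.

```python
import numpy as np

def fit_ar(x, p):
    """OLS fit of x_t on its first p lags (no intercept, for simplicity)."""
    y = x[p:]
    X = np.column_stack([x[p - k:len(x) - k] for k in range(1, p + 1)])
    phi, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ phi
    return phi, resid - resid.mean()

def sieve_bootstrap_interval(x, h=1, B=500, alpha=0.05, seed=0):
    """Percentile prediction interval for x_{T+h} from residual resampling of an AR(p) sieve."""
    rng = np.random.default_rng(seed)
    p = max(1, int(round(len(x) ** (1 / 3))))        # the approximation order grows slowly with n
    phi, resid = fit_ar(x, p)
    paths = np.empty(B)
    for b in range(B):
        hist = list(x[-p:])
        for _ in range(h):
            hist.append(np.dot(phi, hist[::-1][:p]) + rng.choice(resid))
        paths[b] = hist[-1]
    return np.quantile(paths, [alpha / 2, 1 - alpha / 2])

rng = np.random.default_rng(7)
T = 400
x = np.zeros(T)
for t in range(1, T):
    x[t] = 0.6 * x[t - 1] + rng.normal()
print(sieve_bootstrap_interval(x, h=3))              # interval for the 3-step-ahead value
```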
Journal of Time Series Analysis | 2007
M. Angeles Carnero; Daniel Peña; Esther Ruiz
This paper analyses how outliers affect the identification of conditional heteroscedasticity and the estimation of generalized autoregressive conditionally heteroscedastic (GARCH) models. First, we derive the asymptotic biases of the sample autocorrelations of squared observations generated by stationary processes and show that the properties of some conditional homoscedasticity tests can be distorted. Second, we obtain the asymptotic and finite sample biases of the ordinary least squares (OLS) estimator of ARCH(p) models. The finite sample results are extended to generalized least squares (GLS), maximum likelihood (ML) and quasi-maximum likelihood (QML) estimators of ARCH(p) and GARCH(1,1) models. Finally, we show that the estimated asymptotic standard deviations are biased estimates of the sample standard deviations.
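A small simulation in the spirit of these results (my own illustrative configuration, not one of the paper's derivations): a patch of two consecutive large additive outliers in an otherwise homoscedastic series produces a large spurious first-order autocorrelation in the squared observations, exactly the kind of distortion that can mislead conditional-homoscedasticity tests.

```python
import numpy as np

def acf(x, m):
    """Sample autocorrelations r_1, ..., r_m."""
    x = x - x.mean()
    denom = np.sum(x ** 2)
    return np.array([np.sum(x[k:] * x[:len(x) - k]) / denom for k in range(1, m + 1)])

rng = np.random.default_rng(8)
y = rng.normal(size=1000)                 # homoscedastic series: ACF of y^2 is near zero
y_out = y.copy()
y_out[500:502] += 15                      # a patch of two consecutive large additive outliers

print("clean    ACF of y^2:", np.round(acf(y ** 2, 3), 3))
print("outliers ACF of y^2:", np.round(acf(y_out ** 2, 3), 3))
# The patch creates a large spurious lag-1 autocorrelation in the squares, so tests for
# conditional heteroscedasticity based on these autocorrelations can be badly distorted.
```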
Journal of the American Statistical Association | 2001
Daniel Peña; Francisco J. Prieto
This article describes a procedure to identify clusters in multivariate data using information obtained from the univariate projections of the sample data onto certain directions. The directions are chosen as those that minimize and maximize the kurtosis coefficient of the projected data. It is shown that, under certain conditions, these directions provide the largest separation for the different clusters. The projected univariate data are used to group the observations according to the values of the gaps or spacings between consecutive ordered observations. These groupings are then combined over all projection directions. The behavior of the method is tested on several examples, and compared to k-means, MCLUST, and the procedure proposed by Jones and Sibson in 1987. The proposed algorithm is iterative, affine equivariant, flexible, robust to outliers, fast to implement, and seems to work well in practice.
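A toy illustration of the gap/spacing step on a single projection: sort the projected values and cut the sample at the largest spacing between consecutive ordered observations. The search for kurtosis-optimal directions, the calibration of which spacings are significantly large, and the combination across directions used by the full algorithm are omitted here.

```python
import numpy as np

def split_at_largest_gap(y):
    """Two-group labelling obtained by cutting the sorted projections at the widest spacing."""
    order = np.argsort(y)
    i = np.diff(y[order]).argmax()          # position of the largest gap
    labels = np.empty(len(y), dtype=int)
    labels[order] = (np.arange(len(y)) > i).astype(int)
    return labels

rng = np.random.default_rng(9)
# two well-separated groups as they might appear along one kurtosis-optimal direction
y = np.concatenate([rng.normal(0, 1, 100), rng.normal(8, 1, 100)])
print(np.bincount(split_at_largest_gap(y)))   # roughly 100 observations in each group
```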