Jun Yan
University of Connecticut
Publication
Featured research published by Jun Yan.
Statistics and Computing | 2011
Ivan Kojadinovic; Jun Yan
Recent large scale simulations indicate that a powerful goodness-of-fit test for copulas can be obtained from the process comparing the empirical copula with a parametric estimate of the copula derived under the null hypothesis. A first way to compute approximate p-values for statistics derived from this process consists of using the parametric bootstrap procedure recently thoroughly revisited by Genest and Rémillard. Because it heavily relies on random number generation and estimation, the resulting goodness-of-fit test has a very high computational cost that can be regarded as an obstacle to its application as the sample size increases. An alternative approach proposed by the authors consists of using a multiplier procedure. The study of the finite-sample performance of the multiplier version of the goodness-of-fit test for bivariate one-parameter copulas showed that it provides a valid alternative to the parametric bootstrap-based test while being orders of magnitude faster. The aim of this work is to extend the multiplier approach to multivariate multiparameter copulas and study the finite-sample performance of the resulting test. Particular emphasis is put on elliptical copulas such as the normal and the t as these are flexible models in a multivariate setting. The implementation of the procedure for the latter copulas proves challenging and requires the extension of the Plackett formula for the t distribution to arbitrary dimension. Extensive Monte Carlo experiments, which could be carried out only because of the good computational properties of the multiplier approach, confirm in the multivariate multiparameter context the satisfactory behavior of the goodness-of-fit test.
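To make the process described above concrete, the following is a minimal sketch of a Cramér-von Mises type statistic that compares the empirical copula with a parametric copula in the bivariate case. It is an illustration only, not the authors' implementation: the function names are hypothetical, the Clayton family is chosen purely as an example, its parameter `theta` is treated as given rather than estimated under the null hypothesis, and the step that produces p-values (multiplier or parametric bootstrap) is not included.

```python
from typing import Callable, List, Sequence


def pseudo_observations(x: Sequence[float]) -> List[float]:
    """Rescaled ranks R_i / (n + 1); assumes continuous data without ties."""
    n = len(x)
    order = sorted(range(n), key=lambda i: x[i])
    ranks = [0] * n
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    return [r / (n + 1) for r in ranks]


def empirical_copula(u: Sequence[float], v: Sequence[float],
                     s: float, t: float) -> float:
    """Proportion of pseudo-observations with u_i <= s and v_i <= t."""
    return sum(1 for a, b in zip(u, v) if a <= s and b <= t) / len(u)


def clayton_copula(s: float, t: float, theta: float) -> float:
    """Clayton copula for theta > 0 (example family, an assumption here)."""
    return (s ** -theta + t ** -theta - 1.0) ** (-1.0 / theta)


def cvm_statistic(x: Sequence[float], y: Sequence[float],
                  copula: Callable[[float, float], float]) -> float:
    """Sum of squared differences between the empirical and the
    hypothesized copula, evaluated at the pseudo-observations."""
    u = pseudo_observations(x)
    v = pseudo_observations(y)
    return sum((empirical_copula(u, v, a, b) - copula(a, b)) ** 2
               for a, b in zip(u, v))
```

For perfectly increasing data, a strongly dependent Clayton copula should yield a smaller statistic than the independence copula, for example `cvm_statistic(x, x, lambda s, t: clayton_copula(s, t, 50.0))` versus `cvm_statistic(x, x, lambda s, t: s * t)`.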
Bernoulli | 2011
Christian Genest; Ivan Kojadinovic; Johanna Nešlehová; Jun Yan
It is often reasonable to assume that the dependence structure of a bivariate continuous distribution belongs to the class of extreme-value copulas. The latter are characterized by their Pickands dependence function. The talk is concerned with a procedure for testing whether this function belongs to a given parametric family. The test is based on a Cramér-von Mises statistic measuring the distance between an estimate of the parametric Pickands dependence function and either one of two nonparametric estimators thereof studied by Genest and Segers (2009). As the limiting distribution of the test statistic depends on unknown parameters, it must be estimated via a parametric bootstrap procedure, whose validity is established. Monte Carlo simulations are used to assess the power of the test, and an extension to dependence structures that are left-tail decreasing in both variables is considered.
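As a small illustration of how a Pickands dependence function characterizes an extreme-value copula, the sketch below uses the Gumbel (logistic) family, for which A(t) = (t^θ + (1 − t)^θ)^(1/θ) with θ ≥ 1, together with the standard representation C(u, v) = exp{log(uv) A(log v / log(uv))}. The function names are hypothetical and the Gumbel family is chosen only as an example; the test procedure from the abstract is not reproduced here.

```python
import math


def gumbel_pickands(t: float, theta: float) -> float:
    """Pickands dependence function of the Gumbel family, theta >= 1."""
    return (t ** theta + (1.0 - t) ** theta) ** (1.0 / theta)


def ev_copula_gumbel(u: float, v: float, theta: float) -> float:
    """Extreme-value copula built from its Pickands function,
    C(u, v) = exp(log(uv) * A(log(v) / log(uv))), for 0 < u, v < 1."""
    log_uv = math.log(u) + math.log(v)
    return math.exp(log_uv * gumbel_pickands(math.log(v) / log_uv, theta))
```

With `theta = 1.0` the Pickands function is identically 1 and the copula reduces to independence, `u * v`; larger values of `theta` pull A(t) down toward its lower bound max(t, 1 − t), which corresponds to stronger dependence.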
Statistics and Computing | 2007
Jun Yan; Mary Kathryn Cowles; Shaowen Wang; Marc P. Armstrong
When MCMC methods for Bayesian spatiotemporal modeling are applied to large geostatistical problems, challenges arise as a consequence of memory requirements, computing costs, and convergence monitoring. This article describes the parallelization of a reparametrized and marginalized posterior sampling (RAMPS) algorithm, which is carefully designed to generate posterior samples efficiently. The algorithm is implemented using the Parallel Linear Algebra Package (PLAPACK). The scalability of the algorithm is investigated via simulation experiments that are implemented using a cluster with 25 processors. The usefulness of the method is illustrated with an application to sulfur dioxide concentration data from the Air Quality System database of the U.S. Environmental Protection Agency.
Journal of Multivariate Analysis | 2010
Ivan Kojadinovic; Jun Yan
A new class of tests of extreme-value dependence for bivariate copulas is proposed. It is based on the process comparing the empirical copula with a natural nonparametric rank-based estimator of the unknown copula under extreme-value dependence. A multiplier technique is used to compute approximate p-values for several candidate test statistics. Extensive Monte Carlo experiments were carried out to compare the resulting procedures with the tests of extreme-value dependence recently studied in Ben Ghorbal et al. (2009) [1] and Kojadinovic and Yan (2010) [19]. The finite-sample performance study of the tests is complemented by local power calculations.
Technometrics | 2016
Elizabeth D. Schifano; Jing Wu; Chun Wang; Jun Yan; Ming-Hui Chen
We present statistical methods for big data arising from online analytical processing, where large amounts of data arrive in streams and require fast analysis without storage/access to the historical data. In particular, we develop iterative estimating algorithms and statistical inferences for linear models and estimating equations that update as new data arrive. These algorithms are computationally efficient, minimally storage-intensive, and allow for possible rank deficiencies in the subset design matrices due to rare-event covariates. Within the linear model setting, the proposed online-updating framework leads to predictive residual tests that can be used to assess the goodness of fit of the hypothesized model. We also propose a new online-updating estimator under the estimating equation setting. Theoretical properties of the goodness-of-fit tests and proposed estimators are examined in detail. In simulation studies and real data applications, our estimator compares favorably with competing approaches under the estimating equation setting. Supplementary materials for this article are available online.
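The idea of updating a linear model as data arrive in blocks, without revisiting historical data, can be sketched with running sufficient statistics. The example below is a simplified illustration for a single covariate with an intercept, not the estimators proposed in the article: the class name and its methods are hypothetical, and only the cumulative sums needed for least squares are stored.

```python
from typing import Sequence, Tuple


class OnlineLeastSquares:
    """Least squares for y = b0 + b1 * x, updated block by block.

    Only five running sums are kept, so earlier blocks never need to
    be stored or accessed again.
    """

    def __init__(self) -> None:
        self.n = 0
        self.sx = 0.0   # sum of x
        self.sxx = 0.0  # sum of x^2
        self.sy = 0.0   # sum of y
        self.sxy = 0.0  # sum of x * y

    def update(self, xs: Sequence[float], ys: Sequence[float]) -> None:
        """Fold a new block of observations into the running sums."""
        for x, y in zip(xs, ys):
            self.n += 1
            self.sx += x
            self.sxx += x * x
            self.sy += y
            self.sxy += x * y

    def coefficients(self) -> Tuple[float, float]:
        """Return (intercept, slope) from the accumulated sums."""
        det = self.n * self.sxx - self.sx ** 2
        if det == 0:
            raise ValueError("cumulative design is rank deficient")
        slope = (self.n * self.sxy - self.sx * self.sy) / det
        intercept = (self.sy - slope * self.sx) / self.n
        return intercept, slope
```

A block whose covariate is constant is rank deficient on its own, yet it can still be folded in with `update`; the estimate is available as soon as the cumulative sums identify both coefficients. For instance, after `update([0.0, 1.0], [2.0, 5.0])` and `update([2.0, 3.0, 4.0], [8.0, 11.0, 14.0])`, `coefficients()` recovers the intercept 2 and slope 3 of the exact line generating these points.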
Journal of Computational and Graphical Statistics | 2009
Mary Kathryn Cowles; Jun Yan; Brian J. Smith
This article proposes a four-pronged approach to efficient Bayesian estimation and prediction for complex Bayesian hierarchical Gaussian models for spatial and spatiotemporal data. The method involves reparameterizing the covariance structure of the model, reformulating the means structure, marginalizing the joint posterior distribution, and applying a simplex-based slice sampling algorithm. The approach permits fusion of point-source data and areal data measured at different resolutions and accommodates nonspatial correlation and variance heterogeneity as well as spatial and/or temporal correlation. The method produces Markov chain Monte Carlo samplers with low autocorrelation in the output, so that fewer iterations are needed for Bayesian inference than would be the case with other sampling algorithms. Supplemental materials are available online.
Statistics and Computing | 2017
Brian Bader; Jun Yan; Xuebin Zhang
The r largest order statistics approach is widely used in extreme value analysis because it may use more information from the data than just the block maxima. In practice, the choice of r is critical. If r is too large, bias can occur; if too small, the variance of the estimator can be high. The limiting distribution of the r largest order statistics, denoted by GEV_r
The Annals of Applied Statistics | 2018
Brian Bader; Jun Yan; Xuebin Zhang
Canadian Journal of Statistics-revue Canadienne De Statistique | 2018
Chun Wang; Ming-Hui Chen; Jing Wu; Jun Yan; Yuping Zhang; Elizabeth D. Schifano
Journal of Statistical Software | 2007
Jun Yan