Aixin Tan
University of Iowa
Publications
Featured research published by Aixin Tan.
Physical Review D | 2012
X. Qian; Aixin Tan; Wen-Wen Wang; J. J. Ling; R. D. McKeown; C. Zhang
Statistical methods of presenting experimental results in constraining the neutrino mass hierarchy (MH) are discussed. Two related problems are considered: how to report the findings for observed experimental data, and how to evaluate the ability of a future experiment to determine the neutrino mass hierarchy, namely, the sensitivity of the experiment. For the first problem, where experimental data have already been observed, the classical statistical analysis involves constructing confidence intervals for the parameter Δm^2_(32). These intervals are deduced from the parent distribution of the estimator of Δm^2_(32) based on experimental data. Due to existing experimental constraints on |Δm^2_(32)|, the estimator of Δm^2_(32) is better approximated by a Bernoulli distribution (a binomial distribution with one trial) than by a Gaussian distribution. Therefore, the Feldman-Cousins approach, rather than the Gaussian approximation, should be used in constructing confidence intervals. Furthermore, by the definition of confidence intervals, even a correctly constructed interval has a confidence level that does not directly reflect how much the data support one hypothesis of the MH over the other. We thus describe a Bayesian approach that quantifies the evidence provided by the observed experimental data through the (posterior) probability that either hypothesis of the MH is true. This Bayesian presentation of observed experimental results is then used to develop several metrics to assess the sensitivity of future experiments. Illustrations are made using a simple example with a confined parameter space, which approximates the MH determination problem with experimental constraints on |Δm^2_(32)|.
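As a toy illustration of the Bayesian presentation described above, the posterior probability of each hierarchy can be computed directly from the two data likelihoods. The sketch below is a minimal Python version under an assumed equal-prior setup; the function and variable names are hypothetical, not from the paper.

```python
import numpy as np

# Minimal sketch (not the paper's code): posterior probability of each
# mass-hierarchy hypothesis from the data likelihoods, assuming equal
# prior weight on the normal (NH) and inverted (IH) hierarchies.
def posterior_probabilities(log_lik_nh, log_lik_ih, prior_nh=0.5):
    """log_lik_nh / log_lik_ih: log-likelihoods of the observed data
    under each hypothesis."""
    log_w = np.array([np.log(prior_nh) + log_lik_nh,
                      np.log(1.0 - prior_nh) + log_lik_ih])
    log_w -= log_w.max()        # work on the log scale to avoid overflow
    w = np.exp(log_w)
    return w / w.sum()          # (P(NH | data), P(IH | data))

# Example: a log-likelihood difference of 2 (i.e., Delta chi^2 = 4)
p_nh, p_ih = posterior_probabilities(log_lik_nh=-10.0, log_lik_ih=-12.0)
```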
Journal of Computational and Graphical Statistics | 2009
Aixin Tan; James P. Hobert
Bayesian versions of the classical one-way random effects model are widely used to analyze data. If the standard diffuse prior is adopted, there is a simple block Gibbs sampler that can be employed to explore the intractable posterior distribution. In this article, theoretical and methodological results are developed that allow one to use this block Gibbs sampler with the same level of confidence that one would have using classical (iid) Monte Carlo. Indeed, a regenerative simulation method is developed that yields simple, asymptotically valid standard errors for the ergodic averages that are used to estimate intractable posterior expectations. These standard errors can be used to choose an appropriate (Markov chain) Monte Carlo sample size. The regenerative method rests on the assumption that the underlying Markov chain converges to its stationary distribution at a geometric rate. Another contribution of this article is a result showing that, unless the dataset is extremely small and unbalanced, the block Gibbs Markov chain is geometrically ergodic. We illustrate the use of the regenerative method with data from a styrene exposure study. R code for the simulation is posted as an online supplement.
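The regenerative standard-error idea admits a compact sketch: once the chain has been split into tours at regeneration times, the posterior expectation and an asymptotically valid standard error follow from per-tour sums and lengths. This is a minimal Python illustration of that general recipe, not the paper's posted R code; identifying the regeneration times (via a minorization condition) is assumed to have been handled upstream.

```python
import numpy as np

# Sketch of regenerative standard errors for an MCMC estimate of E_pi[g].
def regen_mean_and_se(tour_sums, tour_lens):
    """tour_sums[r] = sum of g(X_t) over tour r; tour_lens[r] = its length."""
    Y = np.asarray(tour_sums, dtype=float)
    N = np.asarray(tour_lens, dtype=float)
    R = len(Y)
    est = Y.sum() / N.sum()                    # ratio estimator of E_pi[g]
    resid = Y - est * N                        # per-tour residuals
    sigma2 = np.mean(resid**2) / np.mean(N)**2
    se = np.sqrt(sigma2 / R)                   # asymptotically valid SE
    return est, se
```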
Genetics | 2008
Chang-Xing Ma; Qibin Yu; Arthur Berg; Derek R. Drost; Evandro Novaes; Guifang Fu; John Stephen Yap; Aixin Tan; Matias Kirst; Yuehua Cui; Rongling Wu
The changes in a phenotypic trait produced by a genotype in response to changes in the environment are referred to as phenotypic plasticity. Despite its importance in the maintenance of genetic diversity via genotype-by-environment interactions, little is known about the detailed genetic architecture of this phenomenon, thus limiting our ability to predict the pattern and process of microevolutionary responses to changing environments. In this article, we develop a statistical model for mapping quantitative trait loci (QTL) that control the phenotypic plasticity of a complex trait through differentiated expressions of pleiotropic QTL in different environments. In particular, our model focuses on count traits, which represent an important aspect of biological systems, controlled by a network of multiple genes and environmental factors. The model was derived within a multivariate mixture model framework in which QTL genotype-specific mixture components are modeled by a multivariate Poisson distribution for a count trait expressed in multiple clonal replicates. A two-stage hierarchical EM algorithm is implemented to obtain the maximum-likelihood estimates of the Poisson parameters that specify environment-specific genetic effects of a QTL and residual errors. By approximating the number of sylleptic branches on the main stems of poplar hybrids by a Poisson distribution, the new model was applied to map QTL that contribute to the phenotypic plasticity of a count trait. The statistical behavior of the model and its utilization were investigated through simulation studies that mimic the poplar example used. This model will provide insights into how genomes and environments interact to determine the phenotypes of complex count traits.
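The mixture-model machinery can be illustrated in miniature. The sketch below runs EM for a two-component univariate Poisson mixture, a deliberately simplified stand-in for the paper's multivariate-Poisson, genotype-specific mixture and its two-stage hierarchical EM; all names are illustrative.

```python
import numpy as np
from scipy.stats import poisson

# Toy illustration only: EM for a two-component univariate Poisson mixture.
def poisson_mixture_em(y, lam=(1.0, 5.0), w=0.5, n_iter=200):
    y = np.asarray(y)
    lam = np.array(lam, dtype=float)
    for _ in range(n_iter):
        # E-step: posterior probability each count came from each component
        log_r = np.stack([np.log(w) + poisson.logpmf(y, lam[0]),
                          np.log1p(-w) + poisson.logpmf(y, lam[1])])
        log_r -= log_r.max(axis=0)
        r = np.exp(log_r)
        r /= r.sum(axis=0)
        # M-step: update mixing weight and component means
        w = r[0].mean()
        lam = (r @ y) / r.sum(axis=1)
    return w, lam
```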
Journal of the Royal Statistical Society, Series B (Statistical Methodology) | 2014
Hani Doss; Aixin Tan
In the classical biased sampling problem, we have k densities π1(·), …, πk(·), each known up to a normalizing constant, i.e., for l = 1, …, k, πl(·) = νl(·)/ml, where νl(·) is a known function and ml is an unknown constant. For each l, we have an iid sample from πl, and the problem is to estimate the ratios ml/ms for all l and all s. This problem arises frequently in several situations in both frequentist and Bayesian inference. An estimate of the ratios was developed and studied by Vardi and his co-workers over two decades ago, and there has been much subsequent work on this problem from many different perspectives. In spite of this, there are no rigorous results in the literature on how to estimate the standard error of the estimate. We present a class of estimates of the ratios of normalizing constants that are appropriate for the case where the samples from the πl's are not necessarily iid sequences, but are Markov chains. We also develop an approach based on regenerative simulation for obtaining standard errors for the estimates of ratios of normalizing constants. These standard error estimates are valid for both the iid case and the Markov chain case.
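To make the target quantity concrete, note the basic identity underlying such estimators: if X_1, …, X_n are draws from πs = νs/ms, then the average of νl(X_i)/νs(X_i) estimates ml/ms. Below is a minimal single-sample Python sketch; the paper's estimator pools all k samples and handles Markov chain draws.

```python
import numpy as np

# Sketch of the one-sample identity E_{pi_s}[nu_l(X)/nu_s(X)] = m_l / m_s.
def ratio_of_norm_consts(samples_s, nu_l, nu_s):
    """samples_s: draws (iid or Markov chain) from pi_s = nu_s / m_s;
    nu_l, nu_s: the known unnormalized densities (vectorized callables)."""
    x = np.asarray(samples_s)
    return np.mean(nu_l(x) / nu_s(x))  # estimates m_l / m_s
```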
Bernoulli | 2007
James P. Hobert; Aixin Tan; Ruitao Liu
Consider a parametric statistical model P(dx|θ) and an improper prior distribution ν(dθ) that together yield a (proper) formal posterior distribution Q(dθ|x). The prior is called strongly admissible if the generalized Bayes estimator of every bounded function of θ is admissible under squared error loss. Eaton [Ann. Statist. 20 (1992) 1147-1179] has shown that a sufficient condition for strong admissibility of ν is the local recurrence of the Markov chain whose transition function is R(θ, dη) = ∫ Q(dη|x)P(dx|θ). Applications of this result and its extensions are often greatly simplified when the Markov chain associated with R is irreducible. However, establishing irreducibility can be difficult. In this paper, we provide a characterization of irreducibility for general state space Markov chains and use this characterization to develop an easily checked, necessary and sufficient condition for irreducibility of Eaton's Markov chain. All that is required to check this condition is a simple examination of P and ν. Application of the main result is illustrated using two examples.
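For concreteness, the transition mechanism of Eaton's chain can be written in display form, restating the formula from the abstract: one step from θ to η first draws data from the model and then a new parameter from the formal posterior.

```latex
% One step of Eaton's chain, theta -> eta:
\[
  R(\theta, d\eta) \;=\; \int Q(d\eta \mid x)\, P(dx \mid \theta).
\]
% Local recurrence of the chain with transition function R is Eaton's
% sufficient condition for strong admissibility of the prior nu.
```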
Journal of Computational and Graphical Statistics | 2015
Aixin Tan; Hani Doss; James P. Hobert
Importance sampling is a classical Monte Carlo technique in which a random sample from one probability density, π1, is used to estimate an expectation with respect to another, π. The importance sampling estimator is strongly consistent and, as long as two simple moment conditions are satisfied, it obeys a central limit theorem (CLT). Moreover, there is a simple consistent estimator for the asymptotic variance in the CLT, which makes for routine computation of standard errors. Importance sampling can also be used in the Markov chain Monte Carlo (MCMC) context. Indeed, if the random sample from π1 is replaced by a Harris ergodic Markov chain with invariant density π1, then the resulting estimator remains strongly consistent. There is a price to be paid, however, as the computation of standard errors becomes more complicated. First, the two simple moment conditions that guarantee a CLT in the iid case are not enough in the MCMC context. Second, even when a CLT does hold, the asymptotic variance has a complex form and is difficult to estimate consistently. In this article, we explain how to use regenerative simulation to overcome these problems. Actually, we consider a more general setup, where we assume that Markov chain samples from several probability densities, π1, …, πk, are available. We construct multiple-chain importance sampling estimators for which we obtain a CLT based on regeneration. We show that if the Markov chains converge to their respective target distributions at a geometric rate, then under moment conditions similar to those required in the iid case, the MCMC-based importance sampling estimator obeys a CLT. Furthermore, because the CLT is based on a regenerative process, there is a simple consistent estimator of the asymptotic variance. We illustrate the method with two applications in Bayesian sensitivity analysis. The first concerns one-way random effect models under different priors. The second involves Bayesian variable selection in linear regression, and for this application, importance sampling based on multiple chains enables an empirical Bayes approach to variable selection.
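The single-chain version of the estimator is easy to state: with draws from a Markov chain having invariant density π1, the self-normalized importance sampling estimator of an expectation under π uses the ratios of the unnormalized densities as weights. Below is a minimal Python sketch; the paper's full setup covers multiple chains and regeneration-based standard errors.

```python
import numpy as np

# Sketch of MCMC-based self-normalized importance sampling.
def importance_sampling_estimate(chain, f, nu, nu1):
    """chain: draws with invariant density pi_1 (known up to a constant
    as nu1); nu: unnormalized version of the target pi; f: the function
    whose pi-expectation we want. All callables are vectorized."""
    x = np.asarray(chain)
    w = nu(x) / nu1(x)                    # unnormalized importance weights
    return np.sum(w * f(x)) / np.sum(w)   # self-normalized estimator
```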
arXiv: Statistics Theory | 2013
Aixin Tan; Galin L. Jones; James P. Hobert
A Markov chain is geometrically ergodic if it converges to its invariant distribution at a geometric rate in total variation norm. We study geometric ergodicity of deterministic and random scan versions of the two-variable Gibbs sampler. We give a sufficient condition which simultaneously guarantees both versions are geometrically ergodic. We also develop a method for simultaneously establishing that both versions are subgeometrically ergodic. These general results allow us to characterize the convergence rate of two-variable Gibbs samplers in a particular family of discrete bivariate distributions.
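The two scan orders are easy to contrast in code. The sketch below implements both for a bivariate normal with correlation ρ, whose full conditionals are N(ρ · other, 1 − ρ²); this toy target is an assumption for illustration, not an example from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Deterministic-scan vs random-scan two-variable Gibbs sampler for a
# standard bivariate normal with correlation rho.
def gibbs(n, rho=0.9, random_scan=False):
    x = y = 0.0
    s = np.sqrt(1.0 - rho**2)     # conditional standard deviation
    out = np.empty((n, 2))
    for t in range(n):
        if random_scan:
            # update one coordinate chosen at random each iteration
            if rng.random() < 0.5:
                x = rho * y + s * rng.standard_normal()
            else:
                y = rho * x + s * rng.standard_normal()
        else:
            # deterministic scan: update x then y every iteration
            x = rho * y + s * rng.standard_normal()
            y = rho * x + s * rng.standard_normal()
        out[t] = x, y
    return out
```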
Nuclear Instruments & Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment | 2016
X. Qian; Aixin Tan; J. J. Ling; Y. Nakajima; C. Zhang
We describe a method based on the CLs approach to present results in searches for new physics under the condition that the relevant parameter space is continuous. We focus on the CLs approach that relies on test statistics developed for non-nested hypothesis testing problems. We demonstrate that a Gaussian approximation to the distribution of these test statistics can be made when the sample size is large, and such an approximation leads to a simple procedure for forming exclusion sets for the parameters of interest, which we call the Gaussian CLs method. The CLs approach presents statistical results differently from the traditional approach of setting confidence intervals (CIs). The two have different objectives: the traditional CI approach identifies an allowable range of parameter values of a model, such that parameter values outside this range have a small chance of generating data similar to the observed data, whereas the CLs approach excludes a set of models that fit the observed data much worse than the reference model (usually the Standard Model). In practice, the most popular method for setting CIs applies a simple chi-square thresholding rule, which is valid only under relatively stringent conditions. When these conditions are not satisfied, computationally intensive Monte Carlo simulation is generally needed to set CIs correctly. In comparison, the CLs approach can be carried out easily with the Gaussian CLs method under fairly mild conditions, which are usually satisfied in the problem of searching for new physics through precision measurements. This work provides a self-contained mathematical proof for the Gaussian CLs method, while explicitly outlining the required conditions. Also, we illustrate the Gaussian CLs method by deriving exclusion sets in the two-dimensional parameter space of (sin^2(2θ), |Δm^2|) in searching for a sterile neutrino, where the CLs approach was rarely used before. Finally, despite the different objectives of the CLs approach and the CI approach, results obtained from the Gaussian CLs method are compared with those of various CI methods in an example.
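Under the Gaussian approximation, the CLs quantity reduces to a ratio of two Gaussian tail probabilities evaluated at the observed test statistic. The sketch below is a hedged illustration of that idea only; the sign and tail conventions depend on how the test statistic is defined, and this follows one common choice rather than reproducing the paper's exact procedure.

```python
from scipy.stats import norm

# Hedged sketch: test statistic T is approximately N(mu0, sd0^2) under the
# reference model H0 and N(mu1, sd1^2) under the alternative H1.
def gaussian_cls(t_obs, mu0, sd0, mu1, sd1):
    p1 = norm.sf(t_obs, loc=mu1, scale=sd1)  # P(T >= t_obs | H1)
    p0 = norm.sf(t_obs, loc=mu0, scale=sd0)  # P(T >= t_obs | H0)
    return p1 / p0

# H1 is excluded at level alpha when CLs <= alpha (e.g., alpha = 0.05).
```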
Computational Statistics & Data Analysis | 2015
Joyee Ghosh; Aixin Tan
Markov chain Monte Carlo (MCMC) algorithms have greatly facilitated the popularity of Bayesian variable selection and model averaging in problems with high-dimensional covariates where enumeration of the model space is infeasible. A variety of such algorithms have been proposed in the literature for sampling models from the posterior distribution in Bayesian variable selection. Ghosh and Clyde proposed a method to exploit the properties of orthogonal design matrices. Their data augmentation algorithm speeds up the computation tremendously compared to traditional Gibbs samplers, and leads to the availability of Rao-Blackwellized estimates of quantities of interest for the original non-orthogonal problem. The algorithm has excellent performance when the correlations among the columns of the design matrix are small, but empirical results suggest that moderate to strong multicollinearity leads to slow mixing. This motivates the development of a class of novel sandwich algorithms for Bayesian variable selection that improve upon the algorithm of Ghosh and Clyde. It is proved that the Haar algorithm with the largest group that acts on the space of models is the optimal algorithm within the parameter expansion data augmentation (PXDA) class of sandwich algorithms. The result provides theoretical insight, but using the largest group is computationally prohibitive, so two new computationally viable sandwich algorithms are developed, which are inspired by the Haar algorithm but do not necessarily belong to the class of PXDA algorithms. It is illustrated via simulation studies and real data analysis that several of the sandwich algorithms can offer substantial gains in the presence of multicollinearity.
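Structurally, a sandwich algorithm wraps an extra, invariance-preserving move between the two halves of a data-augmentation step. The skeleton below shows that structure only; the three kernels are placeholders, not the samplers developed in the paper.

```python
# Structural sketch only (not the paper's samplers): one iteration of a
# sandwich algorithm built on a data-augmentation (DA) step.
def sandwich_step(gamma, draw_aug, extra_move, draw_model, rng):
    """gamma: current model/parameters. draw_aug draws the augmented data
    given gamma; extra_move is the 'meat' of the sandwich, a move that
    leaves the marginal distribution of the augmented data invariant;
    draw_model draws a new model/parameters given the augmented data."""
    z = draw_aug(gamma, rng)     # first DA half: augment
    z = extra_move(z, rng)       # sandwich move
    return draw_model(z, rng)    # second DA half: update model
```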
Statistica Sinica | 2018
Vivekananda Roy; Aixin Tan; James M. Flegal