Featured Researches

Computation

An invitation to sequential Monte Carlo samplers

Sequential Monte Carlo samplers provide consistent approximations of sequences of probability distributions and of their normalizing constants, via particles obtained with a combination of importance weights and Markov transitions. This article presents this class of methods and a number of recent advances, with the goal of helping statisticians assess the applicability and usefulness of these methods for their purposes. Our presentation emphasizes the role of bridging distributions for computational and statistical purposes. Numerical experiments are provided on simple settings such as multivariate Normals, logistic regression and a basic susceptible-infected-recovered model, illustrating the impact of the dimension, the ability to perform inference sequentially and the estimation of normalizing constants.
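The mechanics described in this abstract can be illustrated with a minimal tempered SMC sampler in Python. The bridge below interpolates between an N(0,1) "prior" and an N(3,1) target via a linear temperature ladder; all of these choices are illustrative assumptions, not taken from the article. Since both unnormalised densities share the same normalising constant, the estimated ratio of normalising constants should come out close to 1.

```python
import numpy as np

rng = np.random.default_rng(0)

def log_prior(x):    # unnormalised N(0,1): exp(-x^2/2), normalising constant sqrt(2*pi)
    return -0.5 * x**2

def log_target(x):   # unnormalised N(3,1): exp(-(x-3)^2/2), same normalising constant
    return -0.5 * (x - 3.0)**2

def log_pi(x, b):    # tempered bridge: pi_b proportional to prior^(1-b) * target^b
    return (1.0 - b) * log_prior(x) + b * log_target(x)

def smc_sampler(n=2000, n_steps=20):
    betas = np.linspace(0.0, 1.0, n_steps + 1)
    x = rng.standard_normal(n)   # exact draws from the prior (beta = 0)
    log_z = 0.0                  # log of the ratio of normalising constants Z_1 / Z_0
    for b_prev, b in zip(betas[:-1], betas[1:]):
        # incremental importance weights for moving from pi_{b_prev} to pi_b
        logw = (b - b_prev) * (log_target(x) - log_prior(x))
        log_z += np.log(np.mean(np.exp(logw)))
        # multinomial resampling proportional to the weights
        w = np.exp(logw - logw.max())
        x = x[rng.choice(n, size=n, p=w / w.sum())]
        # one random-walk Metropolis rejuvenation step targeting pi_b
        prop = x + 0.5 * rng.standard_normal(n)
        accept = np.log(rng.uniform(size=n)) < log_pi(prop, b) - log_pi(x, b)
        x = np.where(accept, prop, x)
    return x, np.exp(log_z)
```

The three ingredients named in the abstract appear in order inside the loop: importance weighting against the next bridging distribution, resampling, and a Markov transition.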

Computation

Analytic Evaluation of the Fractional Moments for the Quasi-Stationary Distribution of the Shiryaev Martingale on an Interval

We consider the quasi-stationary distribution of the classical Shiryaev diffusion restricted to the interval [0, A] with absorption at a fixed A > 0. We derive analytically a closed-form formula for the distribution's fractional moment of an arbitrary given order s ∈ ℝ; the formula is consistent with that previously found by Polunchenko and Pepelyshev (2018) for the case of s ∈ ℕ. We also show by virtue of the formula that, if s < 1, then the s-th fractional moment of the quasi-stationary distribution becomes that of the exponential distribution (with mean 1/2) in the limit as A → +∞; the limiting exponential distribution is the stationary distribution of the reciprocal of the Shiryaev diffusion.

Computation

Analyzing Basket Trials under Multisource Exchangeability Assumptions

Basket designs are prospective clinical trials that are devised with the hypothesis that the presence of selected molecular features determines a patient's subsequent response to a particular "targeted" treatment strategy. Basket trials are designed to enroll multiple clinical subpopulations to which it is assumed that the therapy in question offers beneficial efficacy in the presence of the targeted molecular profile. The treatment, however, may not offer acceptable efficacy to all subpopulations enrolled. Moreover, for rare disease settings, such as oncology, wherein these trials have become popular, marginal measures of statistical evidence are difficult to interpret for sparsely enrolled subpopulations. Consequently, basket trials pose challenges to the traditional paradigm for trial design, which assumes inter-patient exchangeability. The R package basket facilitates the analysis of basket trials by implementing multi-source exchangeability models. By evaluating all possible pairwise exchangeability relationships, this hierarchical modeling framework facilitates Bayesian posterior shrinkage among a collection of discrete and pre-specified subpopulations. Analysis functions are provided to implement posterior inference of the response rates and all possible exchangeability relationships between subpopulations. In addition, the package can identify "poolable" subsets of subpopulations and report their response characteristics. The functionality of the package is demonstrated using data from an oncology study with subpopulations defined by tumor histology.

Computation

Analyzing Commodity Futures Using Factor State-Space Models with Wishart Stochastic Volatility

We propose a factor state-space approach with stochastic volatility to model and forecast the term structure of futures contracts on commodities. Our approach builds upon the dynamic 3-factor Nelson-Siegel model and its 4-factor Svensson extension and assumes for the latent level, slope and curvature factors a Gaussian vector autoregression with a multivariate Wishart stochastic volatility process. Exploiting the conjugacy of the Wishart and the Gaussian distribution, we develop a computationally fast and easy-to-implement MCMC algorithm for the Bayesian posterior analysis. An empirical application to daily prices for contracts on crude oil with stipulated delivery dates ranging from one to 24 months ahead shows that the estimated 4-factor Svensson model with two curvature factors provides a good parsimonious representation of the serial correlation in the individual prices and their volatility. It also shows that this model has a good out-of-sample forecast performance.
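The deterministic part of the Nelson-Siegel model and its Svensson extension, which the abstract builds on, is just a fixed loading matrix mapping latent factors to the curve. A minimal Python sketch follows; the decay rates and factor values are made-up numbers for illustration, not estimates from the paper.

```python
import numpy as np

def nelson_siegel_loadings(tau, lam):
    """Loadings of the level, slope and curvature factors at maturities tau (in years)."""
    x = lam * tau
    slope = (1 - np.exp(-x)) / x          # decays from 1 toward 0
    curve = slope - np.exp(-x)            # hump-shaped in maturity
    return np.column_stack([np.ones_like(tau), slope, curve])

def svensson_loadings(tau, lam1, lam2):
    """4-factor Svensson extension: adds a second curvature term with decay lam2."""
    ns = nelson_siegel_loadings(tau, lam1)
    x2 = lam2 * tau
    curve2 = (1 - np.exp(-x2)) / x2 - np.exp(-x2)
    return np.column_stack([ns, curve2])

# illustrative term structure: delivery dates 1..24 months ahead, as in the application
tau = np.arange(1, 25) / 12.0
B = svensson_loadings(tau, lam1=1.4, lam2=0.5)   # decay rates are assumptions
factors = np.array([50.0, -5.0, 10.0, 3.0])      # level, slope, two curvatures (assumed)
prices = B @ factors                              # model-implied curve
```

In the state-space formulation of the paper, `factors` would follow a Gaussian vector autoregression with Wishart stochastic volatility, and `prices` would be observed with noise.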

Computation

Analyzing MCMC Output

Markov chain Monte Carlo (MCMC) is a sampling-based method for estimating features of probability distributions. MCMC methods produce a serially correlated, yet representative, sample from the desired distribution. As such, it can be difficult to know when the MCMC method is producing reliable results. We introduce some fundamental methods for ensuring a trustworthy simulation experiment. In particular, we present a workflow for output analysis in MCMC, providing estimators, approximate sampling distributions, stopping rules, and visualization tools.
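Two of the basic output-analysis estimators this abstract refers to, a batch-means Monte Carlo standard error and a crude effective sample size, can be sketched in a few lines of Python. These are generic textbook versions, not the specific estimators of the article; the batch count and lag cutoff are arbitrary defaults.

```python
import numpy as np

def batch_means_se(chain, n_batches=30):
    """Monte Carlo standard error of the sample mean via non-overlapping batch means."""
    n = len(chain) // n_batches * n_batches     # drop the remainder
    batches = chain[:n].reshape(n_batches, -1).mean(axis=1)
    # sample variance of the batch means, scaled to the full-run average
    return np.sqrt(batches.var(ddof=1) / n_batches)

def effective_sample_size(chain, max_lag=200):
    """Crude ESS from the initial positive autocorrelation sequence."""
    x = chain - chain.mean()
    n = len(x)
    acov = np.array([np.dot(x[:n - k], x[k:]) / n for k in range(max_lag)])
    rho = acov / acov[0]
    # sum autocorrelations only until they first drop below zero
    neg = np.where(rho < 0)[0]
    cut = neg[0] if len(neg) else max_lag
    return n / (1 + 2 * rho[1:cut].sum())
```

For a serially correlated chain the ESS is smaller than the raw sample size, and the batch-means standard error is correspondingly larger than the naive i.i.d. one; stopping rules of the kind the abstract mentions terminate the run once such an error estimate falls below a tolerance.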

Computation

Anytime Parallel Tempering

Developing efficient MCMC algorithms is indispensable in Bayesian inference. In parallel tempering, multiple interacting MCMC chains run to more efficiently explore the state space and improve performance. The multiple chains advance independently through local moves, and the performance enhancement steps are exchange moves, where the chains pause to exchange their current sample amongst each other. To accelerate the independent local moves, they may be performed simultaneously on multiple processors. Another problem is then encountered: depending on the MCMC implementation and inference problem, local moves can take a varying and random amount of time to complete. There may also be infrastructure-induced variations, such as competing jobs on the same processors, as arise in cloud computing. Before exchanges can occur, all chains must complete the local moves they are engaged in to avoid introducing a potentially substantial bias (Proposition 2.1). To solve this issue of randomly varying local move completion times in multi-processor parallel tempering, we adopt the Anytime Monte Carlo framework of Murray et al. (2016): we impose real-time deadlines on the parallel local moves and perform exchanges at these deadlines without any processor idling. We show our methodology for exchanges at real-time deadlines does not introduce a bias and leads to significant performance enhancements over the naïve approach of idling until every processor's local moves complete. The methodology is then applied in an ABC setting, where an Anytime ABC parallel tempering algorithm is derived for the difficult task of estimating the parameters of a Lotka-Volterra predator-prey model, and similar efficiency enhancements are observed.
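The two kinds of moves the abstract distinguishes, local moves within each chain and exchange moves between chains, can be written down compactly. The following Python sketch shows single-process versions for a bimodal toy target; the target, temperature ladder, and step size are assumptions for illustration, and none of the anytime/deadline machinery of the paper is included.

```python
import numpy as np

rng = np.random.default_rng(2)

def log_target(x):
    # bimodal example target: mixture of N(-3,1) and N(3,1), up to a constant
    return np.logaddexp(-0.5 * (x + 3)**2, -0.5 * (x - 3)**2)

betas = np.array([0.1, 0.3, 0.6, 1.0])   # inverse-temperature ladder (an assumption)

def local_move(x, beta, step=1.0):
    """One random-walk Metropolis step targeting pi(x)^beta."""
    prop = x + step * rng.standard_normal()
    if np.log(rng.uniform()) < beta * (log_target(prop) - log_target(x)):
        return prop
    return x

def exchange_move(xs):
    """Propose swapping the states of a randomly chosen adjacent pair of chains."""
    i = rng.integers(len(xs) - 1)
    # Metropolis acceptance ratio for swapping states between temperatures i and i+1
    log_alpha = (betas[i + 1] - betas[i]) * (log_target(xs[i]) - log_target(xs[i + 1]))
    if np.log(rng.uniform()) < log_alpha:
        xs[i], xs[i + 1] = xs[i + 1], xs[i]
    return xs
```

An exchange move only permutes the current states, which is why, as the abstract notes, all chains must be at well-defined states (not mid-local-move) when it is applied; the paper's contribution is making that synchronization happen at real-time deadlines without bias.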

Computation

Application of the interacting particle system method to piecewise deterministic Markov processes used in reliability

Variance reduction methods are often needed for the reliability assessment of complex industrial systems. We focus on one variance reduction method in a given context: the interacting particle system (IPS) method used on piecewise deterministic Markov processes (PDMPs) for reliability assessment. PDMPs are a very large class of processes with high modeling capacity; they can model almost any Markovian phenomenon that does not include diffusion. In reliability assessment, the PDMPs modeling industrial systems generally involve low jump rates and jump kernels favoring one safe arrival; we call such a model a "concentrated PDMP". Used on such concentrated PDMPs, the IPS is inefficient and does not always provide a variance reduction. Indeed, the efficiency of the IPS method relies on simulating many different trajectories during its propagation steps, but unfortunately concentrated PDMPs are likely to generate the same deterministic trajectories over and over. We propose an adaptation of the IPS method, called IPS+M, that reduces this phenomenon. The IPS+M modifies the propagation steps of the IPS by conditioning the propagation to avoid generating the same trajectories multiple times. We prove that, compared to the IPS, the IPS+M method always provides an estimator with a lower variance. We also carry out a quick simulation study on a two-component system that confirms this result.

Computation

Applications of Quantum Annealing in Statistics

Quantum computation offers exciting new possibilities for statistics. This paper explores the use of the D-Wave machine, a specialized type of quantum computer, which performs quantum annealing. A general description of quantum annealing through the use of the D-Wave is given, along with technical issues to be encountered. Quantum annealing is used to perform maximum likelihood estimation, generate an experimental design, and perform matrix inversion. Though the results show that quantum computing is still at an early stage and not yet superior to classical computation, there is promise for quantum computation in the future.

Computation

Approximate Bayesian Computations to fit and compare insurance loss models

Approximate Bayesian Computation (ABC) is a statistical learning technique to calibrate and select models by comparing observed data to simulated data. This technique bypasses the use of the likelihood and requires only the ability to generate synthetic data from the models of interest. We apply ABC to fit and compare insurance loss models using aggregated data. A state-of-the-art ABC implementation in Python is proposed. It uses sequential Monte Carlo to sample from the posterior distribution and the Wasserstein distance to compare the observed and synthetic data.
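The core ABC idea described here, comparing observed and simulated data by a distance instead of evaluating a likelihood, fits in a few lines. The sketch below uses plain ABC rejection rather than the paper's sequential Monte Carlo scheme, and a Gamma loss model with an assumed uniform prior; only the use of the 1-d Wasserstein distance follows the abstract.

```python
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(3)

# "observed" losses: simulated here from a Gamma model with known scale 5.0
obs = rng.gamma(shape=2.0, scale=5.0, size=500)

def simulate(theta, n=500):
    """Synthetic losses from the candidate model (Gamma with fixed shape, unknown scale)."""
    return rng.gamma(shape=2.0, scale=theta, size=n)

def abc_rejection(n_draws=2000, keep=0.05):
    """Plain ABC rejection: keep the prior draws whose synthetic data lie
    closest to the observations in Wasserstein distance."""
    thetas = rng.uniform(0.1, 20.0, size=n_draws)   # uniform prior (an assumption)
    dists = np.array([wasserstein_distance(obs, simulate(t)) for t in thetas])
    cutoff = np.quantile(dists, keep)
    return thetas[dists <= cutoff]

post = abc_rejection()
```

The retained draws `post` approximate the posterior of the scale parameter; an ABC-SMC implementation like the one the paper proposes replaces the single rejection pass with a sequence of shrinking distance thresholds.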

Computation

Approximate Bayesian Inference via Sparse grid Quadrature Evaluation for Hierarchical Models

We combine conditioning techniques with sparse grid quadrature rules to develop a computationally efficient method to approximate marginal, but not necessarily univariate, posterior quantities, yielding approximate Bayesian inference via Sparse grid Quadrature Evaluation (BISQuE) for hierarchical models. BISQuE reformulates posterior quantities as weighted integrals of conditional quantities, such as densities and expectations. Sparse grid quadrature rules allow computationally efficient approximation of high-dimensional integrals, which appear in hierarchical models with many hyperparameters. BISQuE reduces computational effort relative to standard Markov chain Monte Carlo methods by at least two orders of magnitude on several applied and illustrative models. We also briefly discuss using BISQuE to apply Integrated Nested Laplace Approximations (INLA) to models with more hyperparameters than is currently practical.
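The reformulation the abstract describes, writing a marginal quantity as a weighted integral of conditional quantities and evaluating it by quadrature, can be shown on a toy conjugate model. The sketch below uses a one-dimensional Gauss-Hermite rule as a stand-in for the sparse grid rules of the paper (which matter only in higher hyperparameter dimensions); the model, with θ ~ N(0,1) and x | θ ~ N(θ, 1), is an assumption chosen so the exact marginal, N(0, 2), is known.

```python
import numpy as np

def normal_pdf(x, mean, var):
    return np.exp(-0.5 * (x - mean)**2 / var) / np.sqrt(2 * np.pi * var)

def marginal_density(x, deg=20):
    """Approximate f(x) = integral of f(x | theta) f(theta) d(theta) by quadrature,
    with theta ~ N(0,1) and x | theta ~ N(theta, 1); the exact marginal is N(0, 2)."""
    # probabilists' Gauss-Hermite rule: weight function exp(-theta^2 / 2)
    nodes, weights = np.polynomial.hermite_e.hermegauss(deg)
    w = weights / np.sqrt(2 * np.pi)    # normalise so the weights match a N(0,1) prior
    return np.sum(w * normal_pdf(x, nodes, 1.0))
```

The conditional density is evaluated only at the quadrature nodes, which is the source of the speedup over sampling-based methods: for a smooth integrand, a handful of nodes per hyperparameter dimension replaces many MCMC iterations.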
