Featured Researches

Computation

A Kalman particle filter for online parameter estimation with applications to affine models

In this paper we address the problem of estimating the posterior distribution of the static parameters of a continuous-time state space model with discrete-time observations, using an algorithm that combines the Kalman filter and a particle filter. The proposed algorithm is semi-recursive and has a two-layer structure, in which the outer layer estimates the posterior distribution of the unknown parameters and the inner layer estimates the posterior distribution of the state variables. The algorithm has a structure similar to the so-called recursive nested particle filter, but unlike that filter, in which both layers use a particle filter, the proposed algorithm introduces a dynamic kernel to sample the parameter particles in the outer layer, which yields a higher convergence speed. Moreover, the algorithm implements the Kalman filter in the inner layer, reducing the computational time. The algorithm can also be used to estimate parameters whose values change suddenly. We prove that, for a state space model with a certain structure, the estimated posterior distributions of the unknown parameters and the state variables converge to the actual distributions in L^p with rate of order O(N^{-1/2} + δ^{1/2}), where N is the number of parameter particles in the outer layer and δ is the maximum time step between two consecutive observations. We present numerical results for this algorithm; in particular, we implement it for affine interest rate models, possibly with stochastic volatility, although the algorithm applies to a much broader class of models.
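
The two-layer idea can be sketched on a toy linear-Gaussian model: outer-layer particles carry candidate parameter values, each paired with its own exact Kalman filter in the inner layer, and a crude jitter kernel moves the parameter particles at resampling time (the paper's dynamic kernel is more principled). This is an illustrative sketch with made-up toy dynamics (a scalar AR(1) state), not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear-Gaussian model: x_t = theta * x_{t-1} + w_t,  y_t = x_t + v_t.
# theta_true is used only to simulate data; the filter must recover it.
theta_true, q, r, T = 0.8, 0.1, 0.2, 200
x, ys = 0.0, []
for _ in range(T):
    x = theta_true * x + rng.normal(0, np.sqrt(q))
    ys.append(x + rng.normal(0, np.sqrt(r)))

N = 500                               # parameter particles (outer layer)
thetas = rng.uniform(-1, 1, N)        # prior draws for the unknown theta
m = np.zeros(N)                       # one Kalman mean per particle (inner layer)
P = np.ones(N)                        # one Kalman variance per particle
logw = np.zeros(N)

for y in ys:
    # Inner layer: exact Kalman prediction/update for every parameter particle.
    m_pred = thetas * m
    P_pred = thetas**2 * P + q
    S = P_pred + r                    # innovation variance
    logw += -0.5 * (np.log(2 * np.pi * S) + (y - m_pred)**2 / S)
    K = P_pred / S
    m = m_pred + K * (y - m_pred)
    P = (1 - K) * P_pred
    # Outer layer: resample parameters and jitter with a small kernel.
    w = np.exp(logw - logw.max()); w /= w.sum()
    if 1.0 / np.sum(w**2) < N / 2:    # effective sample size too low
        idx = rng.choice(N, N, p=w)
        thetas, m, P = thetas[idx], m[idx], P[idx]
        thetas = thetas + rng.normal(0, 0.02, N)   # jitter kernel
        logw[:] = 0.0

w = np.exp(logw - logw.max()); w /= w.sum()
est = float(np.sum(w * thetas))       # posterior-mean estimate of theta
```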

Read more
Computation

A Koopman framework for rare event simulation in stochastic differential equations

We exploit the relationship between the stochastic Koopman operator and the Kolmogorov backward equation to construct importance sampling schemes for stochastic differential equations. Specifically, we propose using eigenfunctions of the stochastic Koopman operator to approximate the Doob transform for an observable of interest (e.g., associated with a rare event) which in turn yields an approximation of the corresponding zero-variance importance sampling estimator. Our approach is broadly applicable and systematic, treating non-normal systems, non-gradient systems, and systems with oscillatory dynamics or rank-deficient noise in a common framework. In nonlinear settings where the stochastic Koopman eigenfunctions cannot be derived analytically, we use dynamic mode decomposition (DMD) methods to compute them numerically, but the framework is agnostic to the particular numerical method employed. Numerical experiments demonstrate that even coarse approximations of a few eigenfunctions, where the latter are built from non-rare trajectories, can produce effective importance sampling schemes for rare events.
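
As an illustration of computing stochastic Koopman eigenfunctions from data, the sketch below runs EDMD (a DMD variant) on simulated Ornstein-Uhlenbeck trajectories, for which the Koopman eigenvalues are known in closed form. The model, dictionary, and constants are assumptions for the demo, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(1)

# Ornstein-Uhlenbeck process dX = -theta X dt + sigma dW, exactly discretized.
# Its Koopman eigenfunctions are Hermite polynomials with eigenvalues e^{-k theta dt}.
theta, sigma, dt, n = 1.0, 0.5, 0.05, 20000
a = np.exp(-theta * dt)
s = sigma * np.sqrt((1 - a**2) / (2 * theta))
x = np.empty(n)
x[0] = 0.0
for t in range(n - 1):
    x[t + 1] = a * x[t] + s * rng.standard_normal()

# EDMD: lift snapshot pairs through a monomial dictionary, then solve a
# least-squares problem for the finite-dimensional Koopman approximation K.
def lift(z):
    return np.column_stack([np.ones_like(z), z, z**2, z**3])

PX, PY = lift(x[:-1]), lift(x[1:])
K, *_ = np.linalg.lstsq(PX, PY, rcond=None)   # PX @ K ≈ PY row by row
evals, _ = np.linalg.eig(K)

# The slowest nontrivial eigenvalue should be close to e^{-theta dt};
# the largest (≈ 1) belongs to the constant eigenfunction.
nontrivial = sorted(abs(evals))[-2]
```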

Read more
Computation

A Latent Slice Sampling Algorithm

In this paper we introduce a new sampling algorithm which has the potential to be adopted as a universal replacement for the Metropolis--Hastings algorithm. It is related to the slice sampler, and is motivated by an algorithm for discrete probability distributions that obviates the need for a proposal distribution, in that it has no accept/reject component. This paper develops the continuous counterpart. A latent variable, combined with a slice sampler and a shrinkage procedure applied to uniform density functions, creates a highly efficient sampler which can generate random variables from very high dimensional distributions as a single block.
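
For context, the shrinkage idea the abstract builds on appears in Neal's classic slice sampler: draw an auxiliary height under the density, bracket the slice, and shrink the bracket toward the current point after each rejected uniform draw. The sketch below is that generic sampler on a standard normal target, not the paper's latent-variable construction.

```python
import numpy as np

rng = np.random.default_rng(2)

def slice_sample(logpdf, x0, w=2.0, n=5000):
    """Slice sampler with stepping-out and shrinkage (Neal-style)."""
    xs = np.empty(n)
    x = x0
    for i in range(n):
        logy = logpdf(x) + np.log(rng.uniform())   # auxiliary height under pdf
        lo = x - w * rng.uniform()                 # randomly placed bracket
        hi = lo + w
        while logpdf(lo) > logy:                   # step out until the bracket
            lo -= w                                # ends lie outside the slice
        while logpdf(hi) > logy:
            hi += w
        while True:
            xp = rng.uniform(lo, hi)               # uniform draw on the bracket
            if logpdf(xp) > logy:
                x = xp
                break
            if xp < x:                             # shrink toward current point
                lo = xp
            else:
                hi = xp
        xs[i] = x
    return xs

samples = slice_sample(lambda z: -0.5 * z * z, x0=0.0)
```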

Read more
Computation

A Low Rank Gaussian Process Prediction Model for Very Large Datasets

Spatial prediction requires expensive computation to invert the spatial covariance matrix on which it depends, and also has considerable storage needs. This work concentrates on computationally efficient algorithms for prediction using very large datasets. A recent prediction model for spatial data, known as Fixed Rank Kriging, is much faster than kriging and can be implemented easily with fewer assumptions about the process. However, Fixed Rank Kriging requires the estimation of a matrix that must be positive definite, and the original estimation procedure cannot guarantee this property. We present a result that shows when a matrix subtraction of a given form yields a positive definite matrix. Motivated by this result, we present an iterative Fixed Rank Kriging algorithm that ensures positive definiteness of the matrix required for prediction, and show that under mild conditions the algorithm converges numerically. The modified Fixed Rank Kriging procedure is implemented to predict missing chlorophyll observations for very large regions of ocean color. Predictions are compared to those made by other well-known methods of spatial prediction.
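
The computational gain from a fixed-rank covariance of the form Sigma = S K S^T + sigma^2 I can be illustrated with the Sherman-Morrison-Woodbury identity, which reduces a solve from O(n^3) to O(n r^2). The matrices below are random placeholders, not an actual Fixed Rank Kriging fit.

```python
import numpy as np

rng = np.random.default_rng(3)

n, r = 500, 10                         # many locations, few basis functions
S = rng.standard_normal((n, r))        # basis-function matrix
A = rng.standard_normal((r, r))
K = A @ A.T + np.eye(r)                # positive-definite r x r matrix
sig2 = 0.5

# Woodbury identity for Sigma = S K S^T + sig2 I:
#   Sigma^{-1} y = y/sig2 - (1/sig2) S (K^{-1} + S^T S / sig2)^{-1} S^T y / sig2
# Only an r x r system is ever solved.
def woodbury_solve(y):
    z = S.T @ y / sig2
    M = np.linalg.inv(K) + S.T @ S / sig2
    return y / sig2 - S @ np.linalg.solve(M, z) / sig2

y = rng.standard_normal(n)
fast = woodbury_solve(y)                                   # O(n r^2)
slow = np.linalg.solve(S @ K @ S.T + sig2 * np.eye(n), y)  # O(n^3) check
```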

Read more
Computation

A Maximum Entropy Procedure to Solve Likelihood Equations

In this article we provide initial findings on the problem of solving likelihood equations by means of a maximum entropy approach. Unlike standard procedures that require setting the score function of the maximum-likelihood problem to zero, we propose an alternative strategy in which the score is instead used as an external informative constraint on the maximization of the concave Shannon entropy function. The problem involves re-parameterizing the score parameters as expected values of discrete probability distributions whose probabilities need to be estimated. This leads to a simpler situation in which parameters are searched for in a smaller (hyper)simplex space. We assessed our proposal by means of empirical case studies and a simulation study, the latter involving the most critical case of logistic regression under data separation. The results suggest that the maximum entropy re-formulation of the score problem solves the likelihood equations. Similarly, when maximum-likelihood estimation is difficult, as in the case of logistic regression under separation, the maximum entropy proposal achieved results numerically comparable to those obtained by Firth's bias-corrected approach. Overall, these first findings indicate that a maximum entropy solution can be considered an alternative technique for solving likelihood equations.
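
A minimal illustration of the re-parameterization, on the simplest possible likelihood (a Bernoulli mean): the score equation is imposed as an equality constraint while the Shannon entropy of a discrete distribution over a support grid is maximized. The grid, sample size, and solver are assumptions for the demo, not the article's procedure.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(4)
y = rng.binomial(1, 0.3, size=200)          # Bernoulli data, true p = 0.3

# Re-parameterize the success probability as an expected value p = q @ v over
# a fixed support grid v, and maximize Shannon entropy of q subject to the
# Bernoulli score equation  sum(y) - n p = 0  holding as a constraint.
v = np.linspace(0.01, 0.99, 50)             # support points for p
n, s = len(y), y.sum()

def neg_entropy(q):
    q = np.clip(q, 1e-12, None)
    return np.sum(q * np.log(q))            # minimize -H(q)

cons = [
    {"type": "eq", "fun": lambda q: q.sum() - 1.0},    # q is a distribution
    {"type": "eq", "fun": lambda q: s - n * (q @ v)},  # score equation = 0
]
q0 = np.full(len(v), 1.0 / len(v))
res = minimize(neg_entropy, q0, constraints=cons,
               bounds=[(0, 1)] * len(v), method="SLSQP")
p_hat = res.x @ v     # recovered estimate; the score constraint forces s / n
```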

Read more
Computation

A New Estimation Algorithm for Box-Cox Transformation Cure Rate Model and Comparison With EM Algorithm

In this paper, we develop a new estimation procedure based on the non-linear conjugate gradient (NCG) algorithm for the Box-Cox transformation cure rate model. We compare the performance of the NCG algorithm with the well-known expectation maximization (EM) algorithm through a simulation study and show the advantages of the NCG algorithm over the EM algorithm. In particular, we show that the NCG algorithm allows simultaneous maximization of the likelihood over all model parameters even when the likelihood surface is flat with respect to a Box-Cox model parameter. This is a big advantage over the EM algorithm, for which a profile likelihood approach has been proposed in the literature that may not provide satisfactory results. We finally use the NCG algorithm to analyze a well-known melanoma dataset and show that it results in a better fit.
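
The general recipe, maximizing a likelihood directly with non-linear conjugate gradients instead of EM, can be sketched with SciPy's CG method on a toy logistic regression; the data and model here are stand-ins, not the cure rate model.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(5)

# Simulated logistic-regression data (toy stand-in for the likelihood).
n, p = 300, 3
X = np.column_stack([np.ones(n), rng.standard_normal((n, p - 1))])
beta_true = np.array([0.5, -1.0, 2.0])
y = rng.binomial(1, 1 / (1 + np.exp(-X @ beta_true)))

def negloglik(beta):
    eta = X @ beta
    # -loglik = sum log(1 + e^eta) - y'eta, computed stably via logaddexp
    return np.sum(np.logaddexp(0, eta)) - y @ eta

def grad(beta):
    return X.T @ (1 / (1 + np.exp(-X @ beta)) - y)

# Non-linear conjugate gradients maximize over all parameters at once.
res = minimize(negloglik, np.zeros(p), jac=grad, method="CG")
beta_hat = res.x
```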

Read more
Computation

A Note on Particle Gibbs Method and its Extensions and Variants

High-dimensional state trajectories of state-space models pose challenges for Bayesian inference. Particle Gibbs (PG) methods have been widely used to sample from the posterior of a state space model. In essence, Particle Gibbs is a particle Markov chain Monte Carlo (PMCMC) algorithm that mimics the Gibbs sampler by drawing model parameters and states from their conditional distributions. This tutorial provides an introductory view of the Particle Gibbs method and its extensions and variants, and illustrates it through several examples of inference in non-linear state space models (SSMs). We also implement PG samplers in two different programming languages: Python and Rust. A comparison of the run-time performance of the Python and Rust programs is also provided for various PG methods.
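
The engine inside a PG sweep is conditional SMC: one particle slot is pinned to the reference trajectory, the rest evolve as in a bootstrap filter, and a new trajectory is traced back from the final weighted set. A minimal sketch on an assumed linear-Gaussian toy model, with parameters held fixed so only the state-sampling step is shown:

```python
import numpy as np

rng = np.random.default_rng(6)

# Toy model: x_t = 0.9 x_{t-1} + N(0, 0.5),  y_t = x_t + N(0, 1).
T, N, phi, q, r = 50, 100, 0.9, 0.5, 1.0
xs_true = np.zeros(T)
for t in range(1, T):
    xs_true[t] = phi * xs_true[t - 1] + rng.normal(0, np.sqrt(q))
ys = xs_true + rng.normal(0, np.sqrt(r), T)

def csmc(ref):
    """One conditional-SMC pass: returns a fresh trajectory draw."""
    X = np.zeros((N, T))
    anc = np.zeros((N, T), dtype=int)
    X[:, 0] = rng.normal(0, 1, N)
    X[0, 0] = ref[0]                       # pin the reference particle
    logw = -0.5 * (ys[0] - X[:, 0])**2 / r
    for t in range(1, T):
        w = np.exp(logw - logw.max()); w /= w.sum()
        a = rng.choice(N, N, p=w)          # multinomial resampling
        a[0] = 0                           # reference keeps its own ancestor
        X[:, t] = phi * X[a, t - 1] + rng.normal(0, np.sqrt(q), N)
        X[0, t] = ref[t]
        anc[:, t] = a
        logw = -0.5 * (ys[t] - X[:, t])**2 / r
    w = np.exp(logw - logw.max()); w /= w.sum()
    k = rng.choice(N, p=w)                 # trace one trajectory back
    traj = np.zeros(T)
    for t in range(T - 1, -1, -1):
        traj[t] = X[k, t]
        k = anc[k, t]
    return traj

traj = np.zeros(T)
for _ in range(20):                        # a few PG sweeps
    traj = csmc(traj)
```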

Read more
Computation

A Parallel Evolutionary Multiple-Try Metropolis Markov Chain Monte Carlo Algorithm for Sampling Spatial Partitions

We develop an Evolutionary Markov Chain Monte Carlo (EMCMC) algorithm for sampling spatial partitions that lie within a large and complex spatial state space. Our algorithm combines the advantages of evolutionary algorithms (EAs) as optimization heuristics for state space traversal and the theoretical convergence properties of Markov Chain Monte Carlo algorithms for sampling from unknown distributions. Local optimality information that is identified via a directed search by our optimization heuristic is used to adaptively update a Markov chain in a promising direction within the framework of a Multiple-Try Metropolis Markov Chain model that incorporates a generalized Metropolis-Hastings ratio. We further expand the reach of our EMCMC algorithm by harnessing the computational power afforded by massively parallel architecture through the integration of a parallel EA framework that guides Markov chains running in parallel.
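
The Multiple-Try Metropolis building block can be sketched in its textbook special case (symmetric Gaussian proposal, trial weights equal to the target density); the evolutionary and parallel layers of the paper are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(7)

def logsumexp(a):
    m = a.max()
    return m + np.log(np.sum(np.exp(a - m)))

def mtm(logpdf, x0, k=5, step=2.0, n=4000):
    """Multiple-Try Metropolis with symmetric proposals and w(y) = pi(y)."""
    xs = np.empty(n)
    x = x0
    for i in range(n):
        ys = x + step * rng.standard_normal(k)   # k symmetric trials
        ly = logpdf(ys)
        p = np.exp(ly - logsumexp(ly))
        j = rng.choice(k, p=p)                   # pick one trial with prob ∝ pi
        # Reference set: k-1 fresh draws around the chosen trial, plus x itself.
        zs = ys[j] + step * rng.standard_normal(k - 1)
        lz = np.append(logpdf(zs), logpdf(np.array([x])))
        # Generalized MH ratio: sum of trial weights over reference weights.
        if np.log(rng.uniform()) < logsumexp(ly) - logsumexp(lz):
            x = ys[j]
        xs[i] = x
    return xs

samples = mtm(lambda z: -0.5 * z * z, x0=0.0)    # standard normal target
```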

Read more
Computation

A Picture's Worth a Thousand Words: Visualizing n-dimensional Overlap in Logistic Regression Models with Empirical Likelihood

In this note, conditions for the existence and uniqueness of the maximum likelihood estimate for multidimensional-predictor, binary-response models are introduced from a sensitivity-testing point of view. The well-known condition of Silvapulle is translated into an empirical likelihood maximization which, with existing R code, mechanizes the process of assessing overlap status. The translation shifts the meaning of overlap, defined by geometrical properties of the two predictor groups, from the requirement that the intersection of their convex cones be non-empty to the more understandable requirement that the convex hull of their differences contain zero. The code is applied to reveal the character of overlap by examining minimal overlapping structures and cataloging them in dimensions fewer than four. Rules to generate minimal higher-dimensional structures which account for overlap are provided. Supplementary materials are available online.
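
Checking whether the convex hull of the pairwise differences contains zero is a linear-programming feasibility problem. The sketch below (in Python rather than the note's R code) illustrates the idea on two tiny one-dimensional examples; the function name and data are assumptions for the demo.

```python
import numpy as np
from scipy.optimize import linprog

# Overlap check: the convex hull of the pairwise differences between the two
# predictor groups contains the origin iff there exist lam >= 0 with
# sum(lam) = 1 and D^T lam = 0 -- a pure LP feasibility question.
def overlaps(X0, X1):
    D = (X1[:, None, :] - X0[None, :, :]).reshape(-1, X0.shape[1])
    m, d = D.shape
    A_eq = np.vstack([D.T, np.ones(m)])
    b_eq = np.append(np.zeros(d), 1.0)
    res = linprog(np.zeros(m), A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * m)
    return res.status == 0            # feasible LP => overlap => MLE exists

sep0 = np.array([[0.0], [1.0]])       # group 0 entirely below group 1:
sep1 = np.array([[2.0], [3.0]])       # separated, no overlap
mix0 = np.array([[0.0], [2.0]])       # interleaved groups: overlap
mix1 = np.array([[1.0], [3.0]])
```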

Read more
Computation

A Python Library For Empirical Calibration

Dealing with biased data samples is a common task across many statistical fields. In survey sampling, bias often occurs due to unrepresentative samples. In causal studies with observational data, the assignment to treated versus untreated groups is often correlated with covariates, i.e., not random. Empirical calibration is a generic weighting method that presents a unified view of correcting or reducing data biases for the tasks mentioned above. We provide a Python library, EC, to compute empirical calibration weights. The problem is formulated as convex optimization and solved efficiently in the dual form. Compared to existing software, EC is both more efficient and more robust. EC also accommodates different optimization objectives, supports weight clipping, and allows inexact calibration, which improves usability. We demonstrate its usage across various experiments with both simulated and real-world data.
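
The flavour of calibration weighting the abstract describes can be sketched via entropy balancing: minimize the KL divergence of the weights from uniform subject to weighted covariate means hitting known targets, solved through the convex dual. This is a generic sketch under assumed data, not the EC library's API.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(9)

# Biased sample: covariate means are 0.5, but the population target is 0.
n = 1000
X = rng.normal(0.5, 1.0, (n, 2))
target = np.zeros(2)                      # known population covariate means

# Convex dual of the KL objective: weights w_i ∝ exp(x_i'lam), with lam chosen
# so that the weighted covariate means equal the targets.
def dual(lam):
    return np.log(np.mean(np.exp(X @ lam))) - lam @ target

def dual_grad(lam):
    w = np.exp(X @ lam); w /= w.sum()
    return X.T @ w - target               # weighted means minus targets

res = minimize(dual, np.zeros(2), jac=dual_grad, method="BFGS")
w = np.exp(X @ res.x); w /= w.sum()       # calibration weights
```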

Read more
