Featured Research

Methodology

Bayesian Paired-Comparison with the bpcs Package

This article introduces the bpcs R package (Bayesian Paired Comparison in Stan) and the statistical models implemented in it. The package aims to facilitate the use of Bayesian models for paired-comparison data in behavioral research. Bayesian analysis of paired-comparison data allows parameter estimation even when the maximum likelihood estimate does not exist, permits easy extension of paired-comparison models, provides a straightforward interpretation of results through credible intervals, offers better control of type I error and more robust evidence towards the null hypothesis, propagates uncertainties, incorporates prior information, and performs well for models with many parameters and latent variables. The bpcs package provides a consistent interface for R users and several functions to evaluate the posterior distribution of all parameters, to estimate the posterior distribution of any contest between items, and to obtain the posterior distribution of the ranks. Three reanalyses of recent studies that used the frequentist Bradley-Terry model are presented. These reanalyses are conducted with the Bayesian models of the bpcs package, and all the code used to fit the models and to generate the figures and tables is available in the online appendix.
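Below is a minimal, hypothetical numpy sketch of the Bayesian Bradley-Terry model that underlies this kind of paired-comparison analysis: normal priors on latent item abilities, a random-walk Metropolis sampler, and posterior summaries for a single contest and for the ranks. It only illustrates the idea; it is not the bpcs API, which is an R interface to Stan models, and the contest data are invented.

```python
# Minimal sketch of a Bayesian Bradley-Terry model for paired-comparison data.
# Illustrative numpy code, not the bpcs package (bpcs itself is R + Stan).
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: (winner, loser) index pairs for 4 items.
contests = np.array([(0, 1), (0, 2), (1, 2), (2, 3), (0, 3), (1, 3)])

def log_post(lmbda):
    # N(0, 3) priors on the latent "ability" of each item.
    lp = -0.5 * np.sum((lmbda / 3.0) ** 2)
    # Bradley-Terry likelihood: P(i beats j) = logistic(lambda_i - lambda_j).
    diff = lmbda[contests[:, 0]] - lmbda[contests[:, 1]]
    lp += np.sum(-np.log1p(np.exp(-diff)))
    return lp

# Random-walk Metropolis over the ability vector.
lam, samples = np.zeros(4), []
lp_cur = log_post(lam)
for it in range(20000):
    prop = lam + rng.normal(scale=0.3, size=4)
    lp_prop = log_post(prop)
    if np.log(rng.uniform()) < lp_prop - lp_cur:
        lam, lp_cur = prop, lp_prop
    if it > 5000:
        samples.append(lam.copy())
samples = np.array(samples)

# Posterior probability that item 0 beats item 1, and posterior ranks.
p01 = 1.0 / (1.0 + np.exp(-(samples[:, 0] - samples[:, 1])))
ranks = (-samples).argsort(axis=1).argsort(axis=1) + 1
print("P(item 0 beats item 1):", p01.mean())
print("posterior mean rank per item:", ranks.mean(axis=0))
```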

Read more
Methodology

Bayesian Set of Best Dynamic Treatment Regimes and Sample Size Determination for SMARTs with Binary Outcomes

One of the main goals of a sequential, multiple assignment, randomized trial (SMART) is to find the most efficacious dynamic treatment regimes (DTRs) embedded in the design. The analysis method known as multiple comparisons with the best (MCB) allows comparison between dynamic treatment regimes and identification of a set of optimal regimes in the frequentist setting for continuous outcomes, thereby directly addressing the main goal of a SMART. In this paper, we develop a Bayesian generalization of MCB for SMARTs with binary outcomes. Furthermore, we show how to choose the sample size so that the inferior embedded DTRs are screened out with a specified power. We compare log-odds between different DTRs using their exact distribution, without relying on asymptotic normality in either the analysis or the power calculation. We conduct extensive simulation studies under two SMART designs and illustrate our method's application to the Adaptive Treatment for Alcohol and Cocaine Dependence (ENGAGE) trial.
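As a toy illustration of the exact-posterior idea, the sketch below compares hypothetical embedded DTRs on the log-odds scale using exact Beta posteriors and a simple "set of best" inclusion rule. The data, priors, margin delta, and inclusion rule are assumptions for illustration and do not reproduce the paper's SMART-specific estimators or power calculation.

```python
# Toy sketch of comparing embedded DTRs on the log-odds scale using exact
# Beta posteriors rather than asymptotic normality. Data, priors, and the
# inclusion rule below are illustrative assumptions, not the paper's estimator.
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical binary outcomes per embedded DTR: (successes, trials).
dtrs = {"DTR1": (42, 60), "DTR2": (35, 60), "DTR3": (20, 60)}

# Exact posterior draws of each response probability under a Beta(1, 1) prior.
draws = {k: rng.beta(1 + s, 1 + n - s, size=50000) for k, (s, n) in dtrs.items()}
logodds = {k: np.log(p / (1 - p)) for k, p in draws.items()}

# "Set of best": keep a DTR if the posterior probability that its log-odds
# lies within delta of the best DTR's log-odds exceeds 1 - alpha.
names = list(dtrs)
mat = np.column_stack([logodds[k] for k in names])
best = mat.max(axis=1)
delta, alpha = 0.2, 0.05
for j, k in enumerate(names):
    p_near_best = np.mean(mat[:, j] >= best - delta)
    print(k, "P(within delta of best) =", round(p_near_best, 3),
          "-> in set" if p_near_best >= 1 - alpha else "-> screened out")
```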

Read more
Methodology

Bayesian Surrogate Analysis and Uncertainty Propagation with Explicit Surrogate Uncertainties and Implicit Spatio-temporal Correlations

We introduce Bayesian Probability Theory to investigate uncertainty propagation based on meta-models. We approach the problem from the perspective of data analysis, with a given (though almost arbitrary) input probability distribution and a given "training" set of computer simulations. Although Bayesian Probability Theory is mathematically proven to be the unique consistent probability calculus, the subject of this paper is not to demonstrate its beauty but its usefulness. We explicitly list all propositions and lay open the general structure of any uncertainty propagation based on meta-models. The former allows rigorous treatment at any stage, while the latter allows us to quantify the interaction of the surrogate uncertainties with the usual parameter uncertainties. Additionally, we show a simple way to implicitly include spatio-temporal correlations. We then apply the framework jointly to a family of generalized linear meta-models that includes Polynomial Chaos Expansions as a special case. While we assume Gaussian surrogate uncertainty, we do not assume that its scale is known, so the resulting distribution is a Student-t. We end up with semi-analytic formulas for surrogate uncertainties and uncertainty propagation.
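The sketch below illustrates the flavor of such semi-analytic formulas under simplifying assumptions: a generalized linear meta-model with a polynomial basis (standing in for a PCE basis), a conjugate Normal-Inverse-Gamma prior, and an unknown noise scale, so the surrogate predictive is Student-t. The toy simulator, basis, and prior values are invented and are not the paper's formulation.

```python
# Minimal sketch of a Bayesian generalized linear meta-model (polynomial basis,
# in the spirit of a Polynomial Chaos Expansion) with unknown noise scale, so the
# surrogate predictive is Student-t. Training data and priors are invented.
import numpy as np

rng = np.random.default_rng(2)

# "Training" runs of an expensive simulator, here replaced by a toy function.
x = rng.uniform(-1, 1, size=15)
y = np.sin(np.pi * x) + 0.05 * rng.normal(size=x.size)

# Polynomial basis up to degree 3 (a crude stand-in for a PCE basis).
def basis(x):
    return np.column_stack([x**0, x, x**2, x**3])

Phi = basis(x)

# Conjugate Normal-Inverse-Gamma prior: coefficients ~ N(0, tau^2 I), sigma^2 ~ IG(a0, b0).
tau2, a0, b0 = 10.0, 1.0, 0.1
A = Phi.T @ Phi + np.eye(4) / tau2
mean_w = np.linalg.solve(A, Phi.T @ y)
a_n = a0 + x.size / 2
b_n = b0 + 0.5 * (y @ y - mean_w @ A @ mean_w)

# Semi-analytic Student-t predictive at new inputs: mean, scale, degrees of freedom.
xs = np.linspace(-1, 1, 5)
Ps = basis(xs)
pred_mean = Ps @ mean_w
scale2 = (b_n / a_n) * (1 + np.sum(Ps * np.linalg.solve(A, Ps.T).T, axis=1))
print("predictive mean:", np.round(pred_mean, 3))
print("predictive t-scale:", np.round(np.sqrt(scale2), 3), "dof:", 2 * a_n)
```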

Read more
Methodology

Bayesian Survival Analysis Using Gamma Processes with Adaptive Time Partition

In Bayesian semi-parametric analyses of time-to-event data, non-parametric process priors are adopted for the baseline hazard function or the cumulative baseline hazard function for a given finite partition of the time axis. However, there is no uncontroversial general guideline for constructing an optimal time partition. While a great deal of research has been done to relax the assumption of fixed split times for other non-parametric processes, to our knowledge no such methods have been developed for the gamma process prior, one of the most widely used priors in Bayesian survival analysis. In this paper, we propose a new Bayesian framework for proportional hazards models in which the cumulative baseline hazard function is modeled a priori by a gamma process. A key feature of the proposed framework is that the number and positions of the interval cutpoints are treated as random and estimated from their posterior distributions.
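The sketch below illustrates the gamma-process prior itself for a fixed, illustrative partition: increments of the cumulative baseline hazard are independent Gamma variables centred on a prior guess. The adaptive treatment of the number and position of cutpoints, which is the paper's contribution, is not implemented here, and all values are assumptions.

```python
# Sketch of the gamma-process prior on a cumulative baseline hazard H0(t):
# for a given partition, the increments of H0 are independent Gamma variables
# centred on a prior guess H*(t). The adaptive (random) partition of the paper
# is not implemented; the partition and parameters below are illustrative.
import numpy as np

rng = np.random.default_rng(3)

cutpoints = np.array([0.0, 1.0, 2.5, 4.0, 6.0])      # fixed interval endpoints
c0 = 2.0                                              # prior confidence parameter
H_star = lambda t: 0.3 * t                            # prior mean cumulative hazard

# Gamma-process increments: dH_j ~ Gamma(c0 * (H*(t_j) - H*(t_{j-1})), rate = c0),
# so E[dH_j] matches the prior-guess increment.
dH_star = np.diff(H_star(cutpoints))
prior_draws = rng.gamma(shape=c0 * dH_star, scale=1.0 / c0, size=(4, dH_star.size))
H0_draws = np.cumsum(prior_draws, axis=1)

print("prior mean increments :", dH_star)
print("sampled H0 at cutpoints:\n", np.round(H0_draws, 3))
```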

Read more
Methodology

Bayesian Testing for Exogenous Partition Structures in Stochastic Block Models

Network data often exhibit block structures characterized by clusters of nodes with similar patterns of edge formation. When such relational data are complemented by additional information on exogenous node partitions, these sources of knowledge are typically included in the model to supervise the cluster assignment mechanism or to improve inference on edge probabilities. Although these solutions are routinely implemented, there is a lack of formal approaches to test if a given external node partition is in line with the endogenous clustering structure encoding stochastic equivalence patterns among the nodes in the network. To fill this gap, we develop a formal Bayesian testing procedure which relies on the calculation of the Bayes factor between a stochastic block model with known grouping structure defined by the exogenous node partition and an infinite relational model that allows the endogenous clustering configurations to be unknown, random and fully revealed by the block-connectivity patterns in the network. A simple Markov chain Monte Carlo method for computing the Bayes factor and quantifying uncertainty in the endogenous groups is proposed. This routine is evaluated in simulations and in an application to study exogenous equivalence structures in brain networks of Alzheimer's patients.
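As a rough illustration, the sketch below computes the marginal likelihood of the "known partition" side of such a Bayes factor: with Beta priors on the block edge probabilities, it is a product of Beta-Binomial terms over block pairs. The infinite relational model side, handled in the paper via MCMC, is not shown; the network and grouping are toy inputs.

```python
# Marginal likelihood of a stochastic block model whose node partition is fixed
# by an exogenous grouping, with Beta(a, b) priors on block edge probabilities.
# The adjacency matrix and grouping below are toy inputs; the competing infinite
# relational model (the other side of the Bayes factor) is not computed here.
import numpy as np
from itertools import combinations_with_replacement
from scipy.special import betaln

rng = np.random.default_rng(4)

# Toy network: 10 nodes, exogenous two-group partition, assortative structure.
z = np.array([0] * 5 + [1] * 5)
probs = np.where(z[:, None] == z[None, :], 0.7, 0.1)
A = np.triu(rng.binomial(1, probs), 1)
A = A + A.T                                   # symmetric adjacency, no self-loops

a, b = 1.0, 1.0                               # Beta prior on block edge probabilities
log_ml = 0.0
for r, s in combinations_with_replacement(np.unique(z), 2):
    block = (z[:, None] == r) & (z[None, :] == s)
    if r == s:
        block = np.triu(block, 1)             # count each within-group dyad once
    edges, dyads = A[block].sum(), block.sum()
    # Beta-Binomial marginal likelihood of the edges in block pair (r, s).
    log_ml += betaln(a + edges, b + dyads - edges) - betaln(a, b)

print("log marginal likelihood of the SBM with the exogenous partition:", round(log_ml, 3))
```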

Read more
Methodology

Bayesian Uncertainty Quantification for Low-rank Matrix Completion

We consider the problem of uncertainty quantification for an unknown low-rank matrix X, given a partial and noisy observation of its entries. This quantification of uncertainty is essential for many real-world problems, including image processing, satellite imaging, and seismology, providing a principled framework for validating scientific conclusions and guiding decision-making. However, the existing literature has largely focused on the completion (i.e., point estimation) of the matrix X, with little work on investigating its uncertainty. To this end, we propose in this work a new Bayesian modeling framework, called BayeSMG, which parametrizes the unknown X via its underlying row and column subspaces. This Bayesian subspace parametrization allows for efficient posterior inference on matrix subspaces, which represent interpretable phenomena in many applications. This can then be leveraged for improved matrix recovery. We demonstrate the effectiveness of BayeSMG over existing Bayesian matrix recovery methods in numerical experiments and a seismic sensor network application.
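The sketch below shows only the subspace parameterization and observation model behind this setup: a rank-r matrix X = U diag(s) V^T with orthonormal U and V, observed on a random subset of entries with Gaussian noise. The BayeSMG priors and posterior sampler are not reproduced; all values are illustrative.

```python
# Forward-model sketch of subspace-parametrized matrix completion:
# X = U diag(s) V^T with orthonormal row/column bases, partially observed with
# Gaussian noise. Inference (the BayeSMG sampler) is not implemented here.
import numpy as np

rng = np.random.default_rng(5)
m, n, r = 30, 20, 3

# Orthonormal row/column subspace bases via QR of Gaussian matrices.
U, _ = np.linalg.qr(rng.normal(size=(m, r)))
V, _ = np.linalg.qr(rng.normal(size=(n, r)))
s = np.sort(rng.gamma(shape=2.0, scale=5.0, size=r))[::-1]
X = U @ np.diag(s) @ V.T

# Partial, noisy observation of the entries.
mask = rng.uniform(size=(m, n)) < 0.3
Y = np.where(mask, X + 0.1 * rng.normal(size=(m, n)), np.nan)

print("rank:", r, "| observed fraction:", round(mask.mean(), 2))
print("singular values of X:", np.round(s, 2))
```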

Read more
Methodology

Bayesian causal inference for count potential outcomes

The count-modeling literature provides useful tools for conducting causal inference when outcomes take non-negative integer values. Applying these tools within the potential outcomes framework, we link the Bayesian causal inference literature to statistical models for count data. We discuss the general architectural considerations for constructing the predictive posterior of the missing potential outcomes. We also discuss special considerations for estimating average treatment effects, some of which generalize certain relationships and some of which have not yet been encountered in the causal inference literature.
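As a toy example of posterior-predictive imputation of missing count potential outcomes, the sketch below uses independent Gamma-Poisson models per arm (a deliberately simple choice, not the paper's recommended architecture), imputes each unit's unobserved counterfactual count, and summarizes the posterior of the sample average treatment effect.

```python
# Toy sketch: impute missing count potential outcomes from the posterior
# predictive under Gamma-Poisson models per arm, then summarize the posterior
# of the sample average treatment effect. Data and priors are invented.
import numpy as np

rng = np.random.default_rng(6)

# Hypothetical observed counts y under the treatment actually received.
treat = np.array([1, 1, 1, 0, 0, 0, 1, 0, 1, 0])
y_obs = np.array([5, 7, 4, 2, 3, 1, 6, 2, 8, 1])

a0, b0 = 1.0, 1.0
ate_draws = []
for _ in range(20000):
    # Posterior draws of each arm's Poisson rate (Gamma-Poisson conjugacy).
    lam1 = rng.gamma(a0 + y_obs[treat == 1].sum(), 1 / (b0 + (treat == 1).sum()))
    lam0 = rng.gamma(a0 + y_obs[treat == 0].sum(), 1 / (b0 + (treat == 0).sum()))
    # Impute each unit's missing potential outcome from the predictive.
    y1 = np.where(treat == 1, y_obs, rng.poisson(lam1, size=treat.size))
    y0 = np.where(treat == 0, y_obs, rng.poisson(lam0, size=treat.size))
    ate_draws.append(np.mean(y1 - y0))

ate_draws = np.array(ate_draws)
print("posterior mean ATE:", round(ate_draws.mean(), 2),
      "| 95% credible interval:", np.round(np.percentile(ate_draws, [2.5, 97.5]), 2))
```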

Read more
Methodology

Bayesian causal inference in probit graphical models

We consider a binary response which is potentially affected by a set of continuous variables. Of special interest is the causal effect on the response due to an intervention on a specific variable. The latter can be meaningfully determined on the basis of observational data through suitable assumptions on the data-generating mechanism. In particular, we assume that the joint distribution obeys the conditional independencies (Markov properties) inherent in a Directed Acyclic Graph (DAG), and the DAG is given a causal interpretation through the notion of interventional distribution. We propose a DAG-probit model in which the response is generated by discretization, through a random threshold, of a continuous latent variable; the latter, jointly with the remaining continuous variables, has a distribution belonging to a zero-mean Gaussian model whose covariance matrix is constrained to satisfy the Markov properties of the DAG. Our model leads to a natural definition of the causal effect conditionally on a given DAG. Since the DAG which generates the observations is unknown, we present an efficient MCMC algorithm whose target is the posterior distribution on the space of DAGs, the Cholesky parameters of the concentration matrix, and the threshold linking the response to the latent variable. Our end result is a Bayesian Model Averaging estimate of the causal effect which incorporates parameter, as well as model, uncertainty. The methodology is assessed using simulation experiments and applied to a gene expression data set originating from breast cancer stem cells.
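For intuition, the sketch below computes the interventional quantity for a single fixed, known DAG with fixed parameters, where the response arises by thresholding a Gaussian latent variable. The structural parameters are assumptions; the paper's MCMC over DAGs, Cholesky parameters, and the threshold, and the resulting model-averaged causal effect, are not reproduced.

```python
# Sketch of the causal-effect computation for a *fixed, known* DAG and parameters:
# X -> latent L -> Y, with Y = 1{L > threshold} (the probit discretization).
# The paper averages this quantity over DAGs and parameters via MCMC (not shown).
import numpy as np
from scipy.stats import norm

beta, sigma, threshold = 0.8, 1.0, 0.5   # assumed structural parameters

def p_y1_do_x(x):
    # Interventional distribution: L | do(X = x) ~ N(beta * x, sigma^2),
    # so P(Y = 1 | do(X = x)) = 1 - Phi((threshold - beta * x) / sigma).
    return 1 - norm.cdf((threshold - beta * x) / sigma)

# Causal effect of shifting X from 0 to 1 on the probability of response.
print("P(Y=1 | do(X=1)) - P(Y=1 | do(X=0)) =",
      round(p_y1_do_x(1.0) - p_y1_do_x(0.0), 3))
```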

Read more
Methodology

Bayesian estimation of trend components within Markovian regime-switching models for wholesale electricity prices: an application to the South Australian wholesale electricity market

We discuss and extend methods for estimating Markovian regime-switching (MRS) and trend models for wholesale electricity prices. We argue that the existing methods of trend estimation used in the electricity price modelling literature either require an ambiguous definition of an extreme price or lead to issues when implementing model selection [23]. The first main contribution of this paper is to design and infer a model which has a model-based definition of extreme prices and permits the use of model selection criteria. Due to the complexity of MRS models, inference is not straightforward; in the existing literature an approximate EM algorithm is used [26]. Another contribution of this paper is to implement exact inference in a Bayesian setting, which also allows the use of posterior predictive checks to assess model fit. We demonstrate the methodologies with the South Australian electricity market.
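The sketch below simulates a toy two-regime MRS price series (a base regime around a deterministic trend plus a spike regime with extreme prices) purely to illustrate the data-generating structure being modelled. All parameters are invented, and neither the paper's trend specification nor its exact Bayesian inference is implemented here.

```python
# Toy data-generating sketch of a two-regime Markov regime-switching (MRS) price
# series: a "base" regime fluctuating around a deterministic trend and a "spike"
# regime with a higher mean and variance. All parameters are invented.
import numpy as np

rng = np.random.default_rng(7)

T = 200
P = np.array([[0.95, 0.05],      # transition matrix: base  -> {base, spike}
              [0.40, 0.60]])     #                    spike -> {base, spike}
trend = 40 + 5 * np.sin(2 * np.pi * np.arange(T) / 48)   # deterministic trend

state, states, prices = 0, [], []
for t in range(T):
    state = rng.choice(2, p=P[state])
    if state == 0:                                   # base regime around the trend
        price = trend[t] + rng.normal(scale=3.0)
    else:                                            # spike regime: extreme prices
        price = trend[t] + rng.lognormal(mean=3.5, sigma=0.4)
    states.append(state)
    prices.append(price)

print("fraction of time in spike regime:", np.mean(states))
print("mean price | base :", round(np.mean([p for p, s in zip(prices, states) if s == 0]), 1))
print("mean price | spike:", round(np.mean([p for p, s in zip(prices, states) if s == 1]), 1))
```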

Read more
Methodology

Bayesian hierarchical stacking: Some models are (somewhere) useful

Stacking is a widely used model averaging technique that asymptotically yields optimal predictions among linear averages. We show that stacking is most effective when model predictive performance is heterogeneous across inputs, and that we can further improve the stacked mixture with a hierarchical model. We generalize stacking to Bayesian hierarchical stacking, in which the model weights vary as a function of the data, are partially pooled, and are inferred using Bayesian inference. We further incorporate discrete and continuous inputs, other structured priors, and time series and longitudinal data. To verify the performance gain of the proposed method, we derive theoretical bounds and demonstrate the approach on several applied problems.
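The sketch below illustrates the core idea of input-dependent stacking weights: two complementary predictive densities are mixed with weights given by a softmax that varies with the input, and the stacked log score is compared with constant 50/50 weights. The weight parameters are fixed by hand for illustration; in Bayesian hierarchical stacking they are partially pooled and inferred from data.

```python
# Minimal sketch of input-dependent stacking: two candidate models' predictive
# densities are mixed with weights w_k(x) = softmax(a_k + b_k * x). The weight
# parameters are fixed here for illustration, not inferred as in the paper.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(8)

x = np.linspace(-2, 2, 50)
y = np.where(x < 0, -1 + 0.1 * rng.normal(size=x.size), 1 + 0.1 * rng.normal(size=x.size))

# Two (deliberately complementary) candidate predictive densities for y | x.
dens1 = norm.pdf(y, loc=-1, scale=0.3)   # model 1 fits the x < 0 region
dens2 = norm.pdf(y, loc=+1, scale=0.3)   # model 2 fits the x > 0 region

# Input-varying weights via a softmax in x (parameters a, b are assumptions).
a, b = np.array([0.0, 0.0]), np.array([-2.0, 2.0])
logits = a[None, :] + np.outer(x, b)
w = np.exp(logits - logits.max(axis=1, keepdims=True))
w /= w.sum(axis=1, keepdims=True)

stacked = w[:, 0] * dens1 + w[:, 1] * dens2
const = 0.5 * dens1 + 0.5 * dens2        # constant-weight stacking for comparison
print("mean log predictive density | input-varying weights:", round(np.log(stacked).mean(), 3))
print("mean log predictive density | constant 50/50 weights:", round(np.log(const).mean(), 3))
```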

Read more
