Publication


Featured research published by Jim E. Griffin.


Journal of the American Statistical Association | 2006

Order-Based Dependent Dirichlet Processes

Jim E. Griffin; Mark F. J. Steel

In this article we propose a new framework for Bayesian nonparametric modeling with continuous covariates. In particular, we allow the nonparametric distribution to depend on covariates through ordering the random variables building the weights in the stick-breaking representation. We focus mostly on the class of random distributions that induces a Dirichlet process at each covariate value. We derive the correlation between distributions at different covariate values and use a point process to implement a practically useful type of ordering. Two main constructions with analytically known correlation structures are proposed. Practical and efficient computational methods are introduced. We apply our framework, through mixtures of these processes, to regression modeling, the modeling of stochastic volatility in time series data, and spatial geostatistical modeling.
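The ordering idea in this abstract can be illustrated with a small sketch. It assumes, purely for illustration, that each stick carries a location `z[k]` and that the sticks are broken in order of increasing distance from the covariate value `x`. The names `ordered_stick_weights`, `z`, the mass parameter value and the finite number of sticks are hypothetical choices, not the authors' construction.

```python
import random


def ordered_stick_weights(v, z, x):
    """Stick-breaking weights at covariate value x.

    v: stick fractions in (0, 1); z: one location per stick (an assumption).
    Sticks are broken in order of increasing distance |x - z[k]|.
    """
    order = sorted(range(len(v)), key=lambda k: abs(x - z[k]))
    weights = [0.0] * len(v)
    remaining = 1.0  # mass of the stick not yet broken off
    for k in order:
        weights[k] = v[k] * remaining
        remaining *= 1.0 - v[k]
    return weights


rng = random.Random(0)
M = 1.0        # Dirichlet process mass parameter (illustrative value)
n_sticks = 20  # finite truncation, for illustration only
v = [rng.betavariate(1.0, M) for _ in range(n_sticks)]
z = [rng.random() for _ in range(n_sticks)]  # hypothetical stick locations

w_at_0 = ordered_stick_weights(v, z, 0.0)
w_at_1 = ordered_stick_weights(v, z, 1.0)
```

Because the same fractions `v` are used at every `x` and only their order changes, the total mass assigned is identical at all covariate values, while the individual weights vary with `x`.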


Bayesian Analysis | 2010

Inference with normal-gamma prior distributions in regression problems

Jim E. Griffin; Philip J. Brown

This paper considers the effects of placing an absolutely continuous prior distribution on the regression coefficients of a linear model. We show that the posterior expectation is a matrix-shrunken version of the least squares estimate where the shrinkage matrix depends on the derivatives of the prior predictive density of the least squares estimate. The special case of the normal-gamma prior, which generalizes the Bayesian Lasso (Park and Casella 2008), is studied in depth. We discuss the prior interpretation and the posterior effects of hyperparameter choice and suggest a data-dependent default prior. Simulations and a chemometric example are used to compare the performance of the normal-gamma and the Bayesian Lasso in terms of out-of-sample predictive performance.
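A normal-gamma prior can be drawn from as a scale mixture of normals. The sketch below assumes the parameterization in which each coefficient has variance `psi` drawn from a gamma distribution with shape `lam` and scale `2 * gamma_sq`; that parameterization and the function name are assumptions, not taken from the abstract. With `lam = 1` the mixing distribution is exponential, which gives the double-exponential prior of the Bayesian Lasso.

```python
import random


def sample_normal_gamma(lam, gamma_sq, n, seed=0):
    """Draw n coefficients from a normal-gamma prior (assumed parameterization)."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        # Variance psi ~ Gamma(shape=lam, scale=2 * gamma_sq).
        psi = rng.gammavariate(lam, 2.0 * gamma_sq)
        # Coefficient given psi is normal with mean 0 and variance psi.
        draws.append(rng.gauss(0.0, psi ** 0.5))
    return draws
```

Under this parameterization the prior variance of a coefficient is `2 * lam * gamma_sq`. Values of `lam` below 1 place more mass near zero and give heavier tails than the Lasso case.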


Bioinformatics | 2012

Bayesian correlated clustering to integrate multiple datasets

Paul Kirk; Jim E. Griffin; Richard S. Savage; Zoubin Ghahramani; David L. Wild

Motivation: The integration of multiple datasets remains a key challenge in systems biology and genomic medicine. Modern high-throughput technologies generate a broad array of different data types, providing distinct—but often complementary—information. We present a Bayesian method for the unsupervised integrative modelling of multiple datasets, which we refer to as MDI (Multiple Dataset Integration). MDI can integrate information from a wide range of different datasets and data types simultaneously (including the ability to model time series data explicitly using Gaussian processes). Each dataset is modelled using a Dirichlet-multinomial allocation (DMA) mixture model, with dependencies between these models captured through parameters that describe the agreement among the datasets. Results: Using a set of six artificially constructed time series datasets, we show that MDI is able to integrate a significant number of datasets simultaneously, and that it successfully captures the underlying structural similarity between the datasets. We also analyse a variety of real Saccharomyces cerevisiae datasets. In the two-dataset case, we show that MDI’s performance is comparable with the present state-of-the-art. We then move beyond the capabilities of current approaches and integrate gene expression, chromatin immunoprecipitation–chip and protein–protein interaction data, to identify a set of protein complexes for which genes are co-regulated during the cell cycle. Comparisons to other unsupervised data integration techniques—as well as to non-integrative approaches—demonstrate that MDI is competitive, while also providing information that would be difficult or impossible to extract using other methods. Availability: A Matlab implementation of MDI is available from http://www2.warwick.ac.uk/fac/sci/systemsbiology/research/software/. Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.


Econometric Reviews | 2008

Sampling Returns for Realized Variance Calculations: Tick Time or Transaction Time?

Jim E. Griffin; Roel C. A. Oomen

This article introduces a new model for transaction prices in the presence of market microstructure noise in order to study the properties of the price process on two different time scales, namely, transaction time where prices are sampled with every transaction and tick time where prices are sampled with every price change. Both sampling schemes have been used in the literature on realized variance, but a formal investigation into their properties has been lacking. Our empirical and theoretical results indicate that the return dynamics in transaction time are very different from those in tick time and the choice of sampling scheme can therefore have an important impact on the properties of realized variance (RV). For RV we find that tick time sampling is superior to transaction time sampling in terms of mean-squared-error, especially when the level of noise, number of ticks, or the arrival frequency of efficient price moves is low. Importantly, we show that while the microstructure noise may appear close to IID in transaction time, in tick time it is highly dependent. As a result, bias correction procedures that rely on the noise being independent can fail in tick time and are better implemented in transaction time.
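The two sampling schemes can be made concrete with a short sketch. The price series below is hypothetical and the function names are illustrative. Realized variance is computed as the sum of squared log returns, sampling every `step`-th observation on the chosen clock.

```python
import math


def tick_time(prices):
    """Keep only observations where the price differs from the previous kept one."""
    kept = [prices[0]]
    for p in prices[1:]:
        if p != kept[-1]:
            kept.append(p)
    return kept


def realized_variance(prices, step=1):
    """Sum of squared log returns, sampling every `step`-th observation."""
    sampled = prices[::step]
    return sum(math.log(b / a) ** 2 for a, b in zip(sampled, sampled[1:]))


# Hypothetical transaction prices; repeated values are trades with no price change.
transactions = [100.0, 100.0, 100.5, 100.5, 100.5, 100.0, 100.5, 100.5, 101.0, 100.5]
ticks = tick_time(transactions)

rv_transaction_sparse = realized_variance(transactions, step=2)
rv_tick_sparse = realized_variance(ticks, step=2)
```

At the highest frequency (`step=1`) the two clocks give the same realized variance, because zero returns add nothing to the sum. The schemes differ only once returns are sampled more sparsely, as with `step=2` here.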


Journal of Computational and Graphical Statistics | 2011

Posterior Simulation of Normalized Random Measure Mixtures

Jim E. Griffin; Stephen G. Walker

This article describes posterior simulation methods for mixture models whose mixing distribution has a Normalized Random Measure prior. The methods use slice sampling ideas and introduce no truncation error. The approach can be easily applied to both homogeneous and nonhomogeneous Normalized Random Measures and allows the updating of the parameters of the random measure. The methods are illustrated on data examples using both Dirichlet and Normalized Generalized Gamma process priors. In particular, the methods are shown to be computationally competitive with previously developed samplers for Dirichlet process mixture models. Matlab code to implement these methods is available as supplemental material.
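The reason slice sampling avoids truncation error can be sketched in the simpler Dirichlet process case. This is an illustration of the general idea under that assumption, not the normalized random measure algorithm of the article. Given a slice level `u`, only components with weight above `u` matter, and there are finitely many of them, so sticks are generated only until the leftover mass drops below `u`.

```python
import random


def sticks_above_slice(u, M, rng):
    """Generate stick-breaking weights until the leftover mass falls below u.

    u: slice level in (0, 1); M: Dirichlet process mass parameter.
    Every weight not yet generated is at most the leftover mass, hence below u.
    """
    weights = []
    remaining = 1.0
    while remaining >= u:
        v = rng.betavariate(1.0, M)
        weights.append(v * remaining)
        remaining *= 1.0 - v
    return weights, remaining
```

The number of weights returned is random and adapts to `u`, which is why no fixed truncation level has to be chosen in advance.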


Bayesian Analysis | 2015

Two-sample Bayesian nonparametric hypothesis testing

Christopher Holmes; Francois Caron; Jim E. Griffin; David A. Stephens

In this article we describe Bayesian nonparametric procedures for two-sample hypothesis testing. Namely, given two sets of samples $y^{(1)} \overset{\text{iid}}{\sim} F^{(1)}$ and $y^{(2)} \overset{\text{iid}}{\sim} F^{(2)}$, with $F^{(1)}, F^{(2)}$ unknown, we wish to evaluate the evidence for the null hypothesis $H_{0}: F^{(1)} = F^{(2)}$ versus the alternative. Our method is based upon a nonparametric Polya tree prior centered either subjectively or using an empirical procedure. We show that the Polya tree prior leads to an analytic expression for the marginal likelihood under the two hypotheses and hence an explicit measure of the probability of the null $\Pr(H_{0} \mid y^{(1)}, y^{(2)})$.
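A minimal sketch of how such a test can be computed is given below. It rests on several assumptions that are not stated in the abstract: the data have already been mapped to (0, 1) by the centering distribution's CDF, the tree is truncated at a finite `depth`, each split at level j has a symmetric Beta prior with parameter `c * j**2`, and the two hypotheses have equal prior probability. All function names are illustrative.

```python
import math


def log_beta(a, b):
    """Logarithm of the Beta function."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def bin_counts(u, level):
    """Counts of the points in u over the 2**level dyadic bins of (0, 1)."""
    n_bins = 2 ** level
    counts = [0] * n_bins
    for x in u:
        counts[min(int(x * n_bins), n_bins - 1)] += 1
    return counts


def prob_null(u1, u2, depth=8, c=1.0):
    """Posterior probability that both samples share one distribution."""
    log_bf = 0.0  # log marginal likelihood under H0 minus that under H1
    for level in range(1, depth + 1):
        alpha = c * level ** 2
        n1, n2 = bin_counts(u1, level), bin_counts(u2, level)
        for k in range(0, 2 ** level, 2):  # (left, right) children of one parent
            l1, r1, l2, r2 = n1[k], n1[k + 1], n2[k], n2[k + 1]
            prior = log_beta(alpha, alpha)
            # H0: one shared split probability, so the counts are pooled.
            h0 = log_beta(alpha + l1 + l2, alpha + r1 + r2) - prior
            # H1: each sample has its own split probability.
            h1 = (log_beta(alpha + l1, alpha + r1) - prior
                  + log_beta(alpha + l2, alpha + r2) - prior)
            log_bf += h0 - h1
    # Numerically stable logistic transform of the log Bayes factor.
    if log_bf >= 0.0:
        return 1.0 / (1.0 + math.exp(-log_bf))
    e = math.exp(log_bf)
    return e / (1.0 + e)
```

The factors contributed by the centering density are the same under both hypotheses and cancel in the ratio, which is why only the bin counts appear.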


Bioinformatics | 2010

Discovering transcriptional modules by Bayesian data integration

Richard S. Savage; Zoubin Ghahramani; Jim E. Griffin; Bernard J. de la Cruz; David L. Wild

Motivation: We present a method for directly inferring transcriptional modules (TMs) by integrating gene expression and transcription factor binding (ChIP-chip) data. Our model extends a hierarchical Dirichlet process mixture model to allow data fusion on a gene-by-gene basis. This encodes the intuition that co-expression and co-regulation are not necessarily equivalent and hence we do not expect all genes to group similarly in both datasets. In particular, it allows us to identify the subset of genes that share the same structure of transcriptional modules in both datasets. Results: We find that by working on a gene-by-gene basis, our model is able to extract clusters with greater functional coherence than existing methods. By combining gene expression and transcription factor binding (ChIP-chip) data in this way, we are better able to determine the groups of genes that are most likely to represent underlying TMs. Availability: If interested in the code for the work presented in this article, please contact the authors. Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.


Breast Cancer Research | 2014

Timing of pubertal stages and breast cancer risk: the Breakthrough Generations Study

Danielle H. Bodicoat; Minouk J. Schoemaker; Michael E. Jones; Emily McFadden; Jim E. Griffin; Alan Ashworth; Anthony J. Swerdlow

Introduction: Breast development and hormonal changes at puberty might affect breast cancer risk, but epidemiological analyses have focussed largely on age at menarche and not at other pubertal stages. Methods: We investigated associations between the timing of pubertal stages and breast cancer risk using data from a cohort study of 104,931 women (Breakthrough Generations Study, UK, 2003–2013). Pubertal variables were reported retrospectively at baseline. Breast cancer risk was analysed using Cox regression models with breast cancer diagnosis as the outcome of interest, attained age as the underlying time variable, and adjustment for potentially confounding variables. Results: During follow-up (mean = 4.1 years), 1094 breast cancers (including ductal carcinoma in situ) occurred. An increased breast cancer risk was associated with earlier thelarche (age when breast growth begins; HR [95% CI] = 1.23 [1.02, 1.48], 1 [referent] and 0.80 [0.69, 0.93] for ≤10, 11–12 and ≥13 years respectively), menarche (initiation of menses; 1.06 [0.93, 1.21], 1 [referent] and 0.78 [0.62, 0.99] for ≤12, 13–14 and ≥15 years), regular periods (0.99 [0.83, 1.18], 1 [referent] and 0.74 [0.59, 0.92] for ≤12, 13–14 and ≥15 years) and age reached adult height (1.25 [1.03, 1.52], 1 [referent] and 1.07 [0.87, 1.32] for ≤14, 15–16 and ≥17 years), and with increased time between thelarche and menarche (0.87 [0.65, 1.15], 1 [referent], 1.14 [0.96, 1.34] and 1.27 [1.04, 1.55] for <0, 0, 1 and ≥2 years), and shorter time between menarche and regular periods (1 [referent], 0.87 [0.73, 1.04] and 0.66 [0.50, 0.88] for 0, 1 and ≥2 years). These associations were generally similar when considered separately for premenopausal and postmenopausal breast cancer. Conclusions: Breast duct development may be a time of heightened susceptibility to risk of carcinogenesis, and greater attention needs to be given to the relation of breast cancer risk to the different stages of puberty.
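The reported hazard ratios and intervals can be read on the log scale, where a Cox model coefficient lives. The sketch below assumes the intervals are Wald intervals, symmetric around the log hazard ratio with a critical value of 1.96; the abstract does not say how they were computed, and rounding of the published figures limits the precision. The function name is illustrative.

```python
import math


def wald_summary(hr, lower, upper, z_crit=1.96):
    """Approximate standard error, interval centre and z statistic from HR [95% CI]."""
    log_lower, log_upper = math.log(lower), math.log(upper)
    se = (log_upper - log_lower) / (2.0 * z_crit)      # SE of the log hazard ratio
    centre = math.exp((log_lower + log_upper) / 2.0)   # should be close to hr
    z = math.log(hr) / se
    return se, centre, z


# Thelarche at <=10 years versus 11-12 years, as reported in the abstract.
se, centre, z = wald_summary(1.23, 1.02, 1.48)
```

If the assumption holds, the interval centre on the log scale should reproduce the reported hazard ratio up to rounding, which is a quick consistency check on any of the estimates listed above.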


Journal of Computational and Graphical Statistics | 2009

Transdimensional Sampling Algorithms for Bayesian Variable Selection in Classification Problems With Many More Variables Than Observations

Demetris Lamnisos; Jim E. Griffin; Mark F. J. Steel

Model search in probit regression is often conducted by simultaneously exploring the model and parameter space, using a reversible jump MCMC sampler. Standard samplers often have low model acceptance probabilities when there are many more regressors than observations. Implementing recent suggestions in the literature leads to much higher acceptance rates. However, high acceptance rates are often associated with poor mixing of chains. Thus, we design a more general model proposal that allows us to propose models “further” from our current model. This proposal can be tuned to achieve a suitable acceptance rate for good mixing. The effectiveness of this proposal is linked to the form of the marginalization scheme when updating the model and we propose a new efficient implementation of the automatic generic transdimensional algorithm of Green (2003). We also implement other previously proposed samplers and compare the efficiency of all methods on some gene expression datasets. Finally, the results of these applications lead us to propose guidelines for choosing between samplers. Relevant code and datasets are posted as an online supplement.
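The idea of proposing models further from the current one can be sketched with a generic symmetric move that flips a random number of inclusion indicators. This is an illustration only, not the proposal developed in the article; `propose_model` and `max_changes` are hypothetical names, and the acceptance step, which needs the model's marginal likelihood, is omitted.

```python
import random


def propose_model(included, p, max_changes, rng):
    """Propose a new set of included variables out of p candidates.

    Between 1 and max_changes indicators, chosen uniformly at random, are
    flipped: included variables are dropped and excluded ones are added.
    """
    n_changes = rng.randint(1, max_changes)
    flips = rng.sample(range(p), n_changes)
    return set(included).symmetric_difference(flips)


rng = random.Random(0)
current = {0, 5, 7}
proposal = propose_model(current, p=1000, max_changes=4, rng=rng)
```

Raising `max_changes` allows larger jumps through model space at the cost of a lower acceptance rate, which is the trade-off the abstract describes tuning.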


Bayesian Analysis | 2011

Bayesian Nonparametric Modelling of the Return Distribution with Stochastic Volatility

Eleni-Ioanna Delatola; Jim E. Griffin

This paper presents a method for Bayesian nonparametric analysis of the return distribution in a stochastic volatility model. The distribution of the logarithm of the squared return is flexibly modelled using an infinite mixture of Normal distributions. This allows efficient Markov chain Monte Carlo methods to be developed. Links between the return distribution and the distribution of the logarithm of the squared returns are discussed. The method is applied to simulated data, two asset return series and two stock index return series. We find that estimates of volatility using the model can differ dramatically from those using a Normal return distribution if there is evidence of a heavy-tailed return distribution.
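The role of the logarithm of the squared return can be seen from the standard stochastic volatility form y_t = exp(h_t / 2) * eps_t, which gives log(y_t^2) = h_t + log(eps_t^2), a model that is linear in the log volatility h_t. The simulation below is a sketch under assumptions: an AR(1) process for h_t, Normal return errors, and illustrative parameter values. With Normal errors the additive noise follows a log chi-squared distribution with one degree of freedom, whose mean is about -1.27; the paper models this noise distribution flexibly instead of fixing it.

```python
import math
import random


def simulate_log_squared_returns(n, mu=-1.0, phi=0.95, sigma=0.2, seed=0):
    """Simulate log volatilities and log squared returns from a basic SV model."""
    rng = random.Random(seed)
    h = mu  # log volatility, started at its mean
    log_vols, log_sq_returns = [], []
    for _ in range(n):
        h = mu + phi * (h - mu) + sigma * rng.gauss(0.0, 1.0)  # AR(1) log volatility
        eps = rng.gauss(0.0, 1.0)                              # Normal return error
        y = math.exp(h / 2.0) * eps                            # observed return
        log_vols.append(h)
        log_sq_returns.append(math.log(y * y))
    return log_vols, log_sq_returns
```

Subtracting the simulated `log_vols` from `log_sq_returns` recovers the noise term, whose strongly left-skewed shape is what a mixture of Normal distributions has to approximate.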

Collaboration


Dive into Jim E. Griffin's collaborations.

Top Co-Authors


Demetris Lamnisos

Cyprus University of Technology


Maria Kalli

Canterbury Christ Church University
