Publications


Featured research published by Mattias Villani.


Econometric Theory | 2005

Bayesian Reference Analysis of Cointegration

Mattias Villani

A Bayesian reference analysis of the cointegrated vector autoregression is presented based on a new prior distribution. Among other properties, it is shown that this prior distribution distributes its probability mass uniformly over all cointegration spaces for a given cointegration rank and is invariant to the choice of normalizing variables for the cointegration vectors. Several methods for computing the posterior distribution of the number of cointegrating relations and distribution of the model parameters for a given number of relations are proposed, including an efficient Gibbs sampling approach where all inferences are determined from the same posterior sample. Simulated data are used to illustrate the procedures and for discussing the well-known issue of local nonidentification.

The author thanks Luc Bauwens, Anant Kshirsagar, Peter Phillips, Herman van Dijk, four anonymous referees, and especially Daniel Thorburn for helpful comments. Financial support from the Swedish Council of Research in Humanities and Social Sciences (HSFR) grant F0582/1999 and Swedish Research Council (Vetenskapsrådet) grant 412-2002-1007 is gratefully acknowledged. The views expressed in this paper are solely the responsibility of the author and should not be interpreted as reflecting the views of the Executive Board of Sveriges Riksbank.


Frontiers in Neuroinformatics | 2014

BROCCOLI: Software for fast fMRI analysis on many-core CPUs and GPUs

Anders Eklund; Paul A. Dufort; Mattias Villani; Stephen M. LaConte

Analysis of functional magnetic resonance imaging (fMRI) data is becoming ever more computationally demanding as temporal and spatial resolutions improve, and large, publicly available data sets proliferate. Moreover, methodological improvements in the neuroimaging pipeline, such as non-linear spatial normalization, non-parametric permutation tests and Bayesian Markov Chain Monte Carlo approaches, can dramatically increase the computational burden. Despite these challenges, there do not yet exist any fMRI software packages which leverage inexpensive and powerful graphics processing units (GPUs) to perform these analyses. Here, we therefore present BROCCOLI, a free software package written in OpenCL (Open Computing Language) that can be used for parallel analysis of fMRI data on a large variety of hardware configurations. BROCCOLI has, for example, been tested with an Intel CPU, an Nvidia GPU, and an AMD GPU. These tests show that parallel processing of fMRI data can lead to significantly faster analysis pipelines. This speedup can be achieved on relatively standard hardware, but further, dramatic speed improvements require only a modest investment in GPU hardware. BROCCOLI (running on a GPU) can perform non-linear spatial normalization to a 1 mm³ brain template in 4–6 s, and run a second-level permutation test with 10,000 permutations in about a minute. These non-parametric tests are generally more robust than their parametric counterparts, and can also enable more sophisticated analyses by estimating complicated null distributions. Additionally, BROCCOLI includes support for Bayesian first-level fMRI analysis using a Gibbs sampler. The new software is freely available under GNU GPL3 and can be downloaded from GitHub (https://github.com/wanderine/BROCCOLI/).


Journal of Computational and Graphical Statistics | 2012

Regression Density Estimation With Variational Methods and Stochastic Approximation

David J. Nott; Siew Li Tan; Mattias Villani; Robert Kohn

Regression density estimation is the problem of flexibly estimating a response distribution as a function of covariates. An important approach to regression density estimation uses finite mixture models and our article considers flexible mixtures of heteroscedastic regression (MHR) models where the response distribution is a normal mixture, with the component means, variances, and mixture weights all varying as a function of covariates. Our article develops fast variational approximation (VA) methods for inference. Our motivation is that alternative computationally intensive Markov chain Monte Carlo (MCMC) methods for fitting mixture models are difficult to apply when it is desired to fit models repeatedly in exploratory analysis and model choice. Our article makes three contributions. First, a VA for MHR models is described where the variational lower bound is in closed form. Second, the basic approximation can be improved by using stochastic approximation (SA) methods to perturb the initial solution to attain higher accuracy. Third, the advantages of our approach for model choice and evaluation compared with MCMC-based approaches are illustrated. These advantages are particularly compelling for time series data where repeated refitting for one-step-ahead prediction in model choice and diagnostics and in rolling-window computations is very common. Supplementary materials for the article are available online.
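
To make the model concrete, here is a minimal numerical sketch (not the authors' code, and with purely illustrative parameter values) of evaluating a mixture-of-heteroscedastic-regressions density: the component means and log-variances are linear in the covariates, and the mixture weights come from a softmax of linear scores.

```python
import numpy as np
from scipy.stats import norm

def mhr_density(y, x, betas, gammas, deltas):
    """Density of y given x under a K-component heteroscedastic normal mixture.

    betas[k], gammas[k] and deltas[k] parameterize the k-th component's
    mean (x @ beta), log-variance (x @ gamma) and mixing-weight score
    (softmax over x @ delta). All values below are illustrative only.
    """
    eta = np.array([x @ d for d in deltas])                 # weight scores
    weights = np.exp(eta - eta.max())
    weights /= weights.sum()                                # pi_k(x) via softmax
    means = np.array([x @ b for b in betas])                # mu_k(x)
    sds = np.exp(0.5 * np.array([x @ g for g in gammas]))   # sigma_k(x)
    return float(np.sum(weights * norm.pdf(y, loc=means, scale=sds)))

# Toy example: two components, intercept-plus-slope design.
x = np.array([1.0, 0.3])
betas  = [np.array([0.0, 1.0]), np.array([2.0, -1.0])]
gammas = [np.array([-1.0, 0.5]), np.array([0.0, 0.0])]
deltas = [np.array([0.0, 0.0]), np.array([-0.5, 1.0])]
print(mhr_density(0.5, x, betas, gammas, deltas))
```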


Journal of the American Statistical Association | 2018

Speeding up MCMC by Efficient Data Subsampling

Matias Quiroz; Mattias Villani; Robert Kohn

The computing time for Markov Chain Monte Carlo (MCMC) algorithms can be prohibitively large for datasets with many observations, especially when the data density for each observation is costly to evaluate. We propose a framework where the likelihood function is estimated from a random subset of the data, resulting in substantially fewer density evaluations. The data subsets are selected using an efficient Probability Proportional-to-Size (PPS) sampling scheme, where the inclusion probability of an observation is proportional to an approximation of its contribution to the log-likelihood function. Three broad classes of approximations are presented. The proposed algorithm is shown to sample from a distribution that is within O(m^(-1/2)) of the true posterior, where m is the subsample size. Moreover, the constant in the O(m^(-1/2)) error bound of the likelihood is shown to be small and the approximation error is demonstrated to be negligible even for a small m in our applications. We propose a simple way to adaptively choose the sample size m during the MCMC to optimize sampling efficiency for a fixed computational budget. The method is applied to a bivariate probit model on a data set with half a million observations, and on a Weibull regression model with random effects for discrete-time survival data.
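
The core subsampling idea can be sketched in a few lines (a simplified illustration on assumed toy data, not the paper's full adaptive algorithm or its difference estimators): observations are drawn with probability proportional to the size of a cheap approximation of their log-likelihood contribution, and a Hansen-Hurwitz-type weighted average estimates the full-data log-likelihood.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: y_i ~ N(theta, 1), so each observation's log-likelihood
# contribution is a simple Gaussian log-density.
y = rng.normal(loc=1.0, scale=1.0, size=100_000)

def loglik_contrib(theta, y):
    return -0.5 * np.log(2 * np.pi) - 0.5 * (y - theta) ** 2

def pps_loglik_estimate(theta, y, approx, m, rng):
    """PPS (with replacement) estimate of the full-data log-likelihood.

    approx: cheap per-observation approximations of the contributions,
            used only to build the selection probabilities.
    m:      subsample size.
    """
    p = np.abs(approx)
    p = p / p.sum()                        # selection probabilities
    idx = rng.choice(y.size, size=m, p=p)  # PPS sample
    contrib = loglik_contrib(theta, y[idx])
    return np.mean(contrib / p[idx])       # Hansen-Hurwitz estimate of the sum

theta = 0.8
approx = loglik_contrib(y.mean(), y)       # crude approximation at a pilot estimate
est = pps_loglik_estimate(theta, y, approx, m=1_000, rng=rng)
exact = loglik_contrib(theta, y).sum()
print(f"subsampled estimate: {est:.1f}, exact: {exact:.1f}")
```

Inside a Metropolis-Hastings step, the exact log-likelihood would be replaced by such an estimate, which is where the reduction in density evaluations comes from.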


Journal of Financial and Quantitative Analysis | 2014

Taking the Twists into Account: Predicting Firm Bankruptcy Risk with Splines of Financial Ratios

Paolo Giordani; Tor Jacobson; Erik L. von Schedvin; Mattias Villani

We demonstrate improvements in predictive power when introducing spline functions to take account of highly non-linear relationships between firm failure and earnings, leverage, and liquidity in a logistic bankruptcy model. Our results show that modeling these non-linearities yields substantially improved bankruptcy predictions, on the order of 70 to 90 percent, compared with a standard logistic model. The spline model provides several important and surprising insights into non-monotonic bankruptcy relationships. We find that low-leveraged and highly profitable firms are riskier than a standard model suggests. These features are remarkably stable over time, suggesting that they are of a structural nature.
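
As a sketch of the mechanics (hypothetical knots and simulated data, and plain maximum likelihood rather than the paper's estimation procedure), each financial ratio can be expanded in a spline basis before entering the logistic regression, which is what lets the fitted failure probability bend non-monotonically in, for example, leverage.

```python
import numpy as np

def spline_basis(x, knots):
    """Truncated-power cubic spline basis: x, x^2, x^3 and (x - k)_+^3 per knot."""
    cols = [x, x**2, x**3] + [np.clip(x - k, 0.0, None) ** 3 for k in knots]
    return np.column_stack(cols)

def fit_logistic(X, y, n_iter=25):
    """Plain Newton-Raphson maximum likelihood for logistic regression."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        W = p * (1.0 - p)
        beta += np.linalg.solve((X * W[:, None]).T @ X, X.T @ (y - p))
    return beta

rng = np.random.default_rng(1)
n = 5_000
leverage = rng.uniform(0.0, 1.0, n)          # hypothetical financial ratio
# Simulated, non-monotonic failure risk: very low and very high leverage both risky.
true_logit = -2.0 + 6.0 * (leverage - 0.4) ** 2
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-true_logit)))

knots = [0.25, 0.5, 0.75]                    # illustrative knot locations
X = np.column_stack([np.ones(n), spline_basis(leverage, knots)])
print("fitted spline coefficients:", np.round(fit_logistic(X, y), 2))
```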


NeuroImage | 2017

Fast Bayesian whole-brain fMRI analysis with spatial 3D priors

Per Sidén; Anders Eklund; David Bolin; Mattias Villani

Spatial whole-brain Bayesian modeling of task-related functional magnetic resonance imaging (fMRI) is a great computational challenge. Most of the currently proposed methods therefore do inference in subregions of the brain separately or do approximate inference without comparison to the true posterior distribution. A popular such method, which is now the standard method for Bayesian single-subject analysis in the SPM software, is introduced in Penny et al. (2005b). The method processes the data slice-by-slice and uses an approximate variational Bayes (VB) estimation algorithm that enforces posterior independence between activity coefficients in different voxels. We introduce a fast and practical Markov chain Monte Carlo (MCMC) scheme for exact inference in the same model, both slice-wise and for the whole brain using a 3D prior on activity coefficients. The algorithm exploits sparsity and uses modern techniques for efficient sampling from high-dimensional Gaussian distributions, leading to speed-ups without which MCMC would not be a practical option. Using MCMC, we are for the first time able to evaluate the approximate VB posterior against the exact MCMC posterior, and show that VB can lead to spurious activation. In addition, we develop an improved VB method that drops the assumption of independent voxels a posteriori. This algorithm is shown to be much faster than both MCMC and the original VB for large datasets, with negligible error compared to the MCMC posterior.

Highlights:
- A fast method for Bayesian inference in task-fMRI with spatial 3D priors is proposed.
- Sparse techniques for high-dimensional Gaussian sampling give great speed-ups.
- Using exact inference shows that SPM's variational Bayes can lead to false activity.
- An improved variational Bayesian method shows increased speed and accuracy.


Journal of Computational and Graphical Statistics | 2018

Sparse Partially Collapsed MCMC for Parallel Inference in Topic Models

Måns Magnusson; Leif Jonsson; Mattias Villani; David Broman

Topic models, and more specifically the class of latent Dirichlet allocation (LDA), are widely used for probabilistic modeling of text. Markov chain Monte Carlo (MCMC) sampling from the posterior distribution is typically performed using a collapsed Gibbs sampler. We propose a parallel sparse partially collapsed Gibbs sampler and compare its speed and efficiency to state-of-the-art samplers for topic models on five well-known text corpora of differing sizes and properties. In particular, we propose and compare two different strategies for sampling the parameter block with latent topic indicators. The experiments show that the increase in statistical inefficiency from only partial collapsing is smaller than commonly assumed, and can be more than compensated by the speedup from parallelization and sparsity on larger corpora. We also prove that the partially collapsed samplers scale well with the size of the corpus. The proposed algorithm is fast, efficient, exact, and can be used in more modeling situations than the ordinary collapsed sampler. Supplementary materials for this article are available online.
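
A minimal serial sketch of the partially collapsed scheme on toy data (symmetric Dirichlet priors assumed; the sparsity tricks and document-level parallelism from the paper are omitted): the topic-word distributions Phi are sampled explicitly, after which the topic indicators are conditionally independent across documents given Phi, which is exactly what makes parallelization over documents possible.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy corpus: each document is a list of word ids from a vocabulary of size V.
V, K, alpha, beta = 20, 3, 0.5, 0.1
docs = [rng.integers(0, V, size=rng.integers(20, 60)).tolist() for _ in range(30)]

# Initialize topic indicators and count matrices.
z = [rng.integers(0, K, size=len(d)) for d in docs]
n_dk = np.zeros((len(docs), K))   # topic counts per document
n_kw = np.zeros((K, V))           # word counts per topic
for d, (doc, zd) in enumerate(zip(docs, z)):
    for w, k in zip(doc, zd):
        n_dk[d, k] += 1
        n_kw[k, w] += 1

for _ in range(100):
    # Step 1: sample Phi | z (one Dirichlet draw per topic).
    phi = np.array([rng.dirichlet(beta + n_kw[k]) for k in range(K)])

    # Step 2: sample z | Phi. Given Phi, documents are conditionally
    # independent, so this loop could be parallelized over documents.
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k_old = z[d][i]
            n_dk[d, k_old] -= 1
            n_kw[k_old, w] -= 1
            probs = phi[:, w] * (alpha + n_dk[d])
            probs /= probs.sum()
            k_new = rng.choice(K, p=probs)
            z[d][i] = k_new
            n_dk[d, k_new] += 1
            n_kw[k_new, w] += 1

print("top word ids per topic:\n", np.argsort(-phi, axis=1)[:, :5])
```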


NeuroImage | 2017

A Bayesian heteroscedastic GLM with application to fMRI data with motion spikes

Anders Eklund; Martin A. Lindquist; Mattias Villani

We propose a voxel-wise general linear model with autoregressive noise and heteroscedastic noise innovations (GLMH) for analyzing functional magnetic resonance imaging (fMRI) data. The model is analyzed from a Bayesian perspective and has the benefit of automatically down-weighting time points close to motion spikes in a data-driven manner. We develop a highly efficient Markov Chain Monte Carlo (MCMC) algorithm that allows for Bayesian variable selection among the regressors to model both the mean (i.e., the design matrix) and variance. This makes it possible to include a broad range of explanatory variables in both the mean and variance (e.g., time trends, activation stimuli, head motion parameters and their temporal derivatives), and to compute the posterior probability of inclusion from the MCMC output. Variable selection is also applied to the lags in the autoregressive noise process, making it possible to infer the lag order from the data simultaneously with all other model parameters. We use both simulated data and real fMRI data from OpenfMRI to illustrate the importance of proper modeling of heteroscedasticity in fMRI data analysis. Our results show that the GLMH tends to detect more brain activity, compared to its homoscedastic counterpart, by allowing the variance to change over time depending on the degree of head motion.

Highlights:
- A Bayesian heteroscedastic model for analyzing fMRI data is proposed.
- The heteroscedastic model allows for the noise variance to change over time.
- Time points corresponding to motion spikes are automatically down-weighted.
- Variable selection is used for parameters modeling the mean, variance and AR noise.
- More brain activity can be detected, without using any censoring or scrubbing.
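
The down-weighting mechanism can be illustrated with a deliberately simplified two-step frequentist analogue (hypothetical regressors and simulated data; this is not the paper's Bayesian MCMC with variable selection): fit the mean regression, model the log squared residuals with the motion covariates, and refit by weighted least squares so that time points with inflated variance count less.

```python
import numpy as np

rng = np.random.default_rng(2)
T = 200
stimulus = (np.arange(T) % 40 < 20).astype(float)   # hypothetical boxcar regressor
motion = np.zeros(T)
motion[[60, 61, 140]] = 5.0                          # hypothetical motion spikes
X = np.column_stack([np.ones(T), stimulus])          # mean model (design matrix)
Z = np.column_stack([np.ones(T), motion])            # variance model covariates

# Simulate a voxel time series whose noise variance blows up at the spikes.
sigma = np.exp(0.5 * (Z @ np.array([-1.0, 1.5])))
y = X @ np.array([10.0, 1.0]) + rng.normal(0.0, sigma)

# Step 1: ordinary least squares for the mean model.
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta_ols

# Step 2: regress log squared residuals on the variance covariates.
gamma, *_ = np.linalg.lstsq(Z, np.log(resid**2 + 1e-12), rcond=None)
weights = np.exp(-(Z @ gamma))                       # small weight where variance is high

# Step 3: weighted least squares; spike time points are automatically down-weighted.
Xw = X * weights[:, None]
beta_wls = np.linalg.solve(X.T @ Xw, Xw.T @ y)
print("OLS:", np.round(beta_ols, 2), "weighted LS:", np.round(beta_wls, 2))
```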


Journal of Computational and Graphical Statistics | 2018

Efficient Covariance Approximations for Large Sparse Precision Matrices

Per Sidén; Finn Lindgren; David Bolin; Mattias Villani

The use of sparse precision (inverse covariance) matrices has become popular because they allow for efficient algorithms for joint inference in high-dimensional models. Many applications require the computation of certain elements of the covariance matrix, such as the marginal variances, which may be nontrivial to obtain when the dimension is large. This article introduces a fast Rao–Blackwellized Monte Carlo sampling-based method for efficiently approximating selected elements of the covariance matrix. The variance and confidence bounds of the approximations can be precisely estimated without additional computational costs. Furthermore, a method that iterates over subdomains is introduced, and is shown to additionally reduce the approximation errors to practically negligible levels in an application on functional magnetic resonance imaging data. Both methods have low memory requirements, which is typically the bottleneck for competing direct methods.
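
A small dense sketch of a Rao-Blackwellized Monte Carlo estimator of the marginal variances (toy tridiagonal precision matrix; in a realistic application Q would be sparse and the sampling would use a sparse Cholesky factorization): since E[x_i^2 | x_-i] = 1/Q_ii + mu_{i|-i}^2, averaging that conditional expectation over draws from N(0, Q^{-1}) has lower Monte Carlo variance than averaging x_i^2 directly.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy precision matrix: tridiagonal (a 1D Markov random field), kept dense here.
n = 200
Q = (np.diag(np.full(n, 2.5))
     + np.diag(np.full(n - 1, -1.0), 1)
     + np.diag(np.full(n - 1, -1.0), -1))

# Draw samples x ~ N(0, Q^{-1}) by solving L^T x = z, where Q = L L^T.
L = np.linalg.cholesky(Q)
m = 200
X = np.linalg.solve(L.T, rng.standard_normal((n, m)))   # one sample per column

# Naive Monte Carlo estimate of the marginal variances.
var_naive = np.mean(X**2, axis=1)

# Rao-Blackwellized estimate: E[x_i^2 | x_-i] = 1/Q_ii + mu_{i|-i}^2,
# with conditional mean mu_{i|-i} = -(1/Q_ii) * sum_{j != i} Q_ij x_j.
d = np.diag(Q)
cond_mean = -((Q @ X) - d[:, None] * X) / d[:, None]
var_rb = 1.0 / d + np.mean(cond_mean**2, axis=1)

var_exact = np.diag(np.linalg.inv(Q))
print("max abs error, naive MC:", np.max(np.abs(var_naive - var_exact)))
print("max abs error, Rao-Blackwellized:", np.max(np.abs(var_rb - var_exact)))
```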


Scandinavian Journal of Statistics | 2013

Efficient Bayesian Multivariate Surface Regression.

Feng Li; Mattias Villani

Methods for choosing a fixed set of knot locations in additive spline models are fairly well established in the statistical literature. The curse of dimensionality makes it nontrivial to extend these methods to nonadditive surface models, especially when there are more than a couple of covariates. We propose a multivariate Gaussian surface regression model that combines both additive splines and interactive splines, and a highly efficient Markov chain Monte Carlo algorithm that updates all the knot locations jointly. We use a shrinkage prior to avoid overfitting, with different estimated shrinkage factors for the additive and surface parts of the model, and also different shrinkage parameters for the different response variables. Simulated data and an application to firm leverage data show that the approach is computationally efficient, and that allowing for freely estimated knot locations can offer a substantial improvement in out-of-sample predictive performance.

Collaboration


Mattias Villani's top co-authors and their affiliations.


Robert Kohn

University of New South Wales

David Broman

Royal Institute of Technology


Zebo Peng

Linköping University
