Juho Piironen
Helsinki Institute for Information Technology
Publications
Featured research published by Juho Piironen.
Statistics and Computing | 2017
Juho Piironen; Aki Vehtari
The goal of this paper is to compare several widely used Bayesian model selection methods in practical model selection problems, highlight their differences, and give recommendations about the preferred approaches. We focus on variable subset selection for regression and classification and perform several numerical experiments using both simulated and real-world data. The results show that optimizing a utility estimate such as the cross-validation (CV) score is prone to finding overfitted models, owing to the relatively high variance of the utility estimates when the data are scarce. This can also lead to substantial selection-induced bias and optimism in the performance evaluation of the selected model. From a predictive viewpoint, the best results are obtained by accounting for model uncertainty through the full encompassing model, such as the Bayesian model averaging solution over the candidate models. If the encompassing model is too complex, it can be robustly simplified by the projection method, in which the information in the full model is projected onto the submodels. This approach is substantially less prone to overfitting than selection based on the CV score. Overall, the projection method also appears to outperform the maximum a posteriori model and the selection of the most probable variables. The study further demonstrates that model selection can benefit greatly from using cross-validation outside the search process, both for guiding the choice of model size and for assessing the predictive performance of the finally selected model.
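For intuition about the projection step, the following minimal sketch (not the authors' implementation; the toy data and variable names are illustrative) projects posterior draws of a full Gaussian linear model onto a submodel. Under a Gaussian observation model with known noise variance, this projection reduces to a least-squares fit of the full model's linear predictor on the submodel's covariates.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 100, 10
X = rng.normal(size=(n, d))
beta_true = np.concatenate([rng.normal(size=3), np.zeros(d - 3)])
y = X @ beta_true + rng.normal(scale=1.0, size=n)

# Stand-in for posterior draws of the full model's coefficients (a real
# analysis would use MCMC draws from the full Bayesian model).
beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
draws = beta_hat + 0.1 * rng.normal(size=(200, d))

def project(draws, X, subset):
    """Project full-model coefficient draws onto the submodel given by `subset`."""
    Xs = X[:, subset]
    # Each projected draw is the least-squares fit of the full model's
    # linear predictor X @ b on the submodel's covariates Xs.
    return np.array([np.linalg.lstsq(Xs, X @ b, rcond=None)[0] for b in draws])

sub = [0, 1, 2]                      # a candidate submodel
proj_draws = project(draws, X, sub)  # projected draws for the submodel
print(proj_draws.mean(axis=0))
```

The projected draws can then be used to compare the submodel's predictive performance against that of the full model, with cross-validation applied outside the search to guide the choice of model size.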
Electronic Journal of Statistics | 2017
Juho Piironen; Aki Vehtari
The horseshoe prior has proven to be a noteworthy alternative for sparse Bayesian estimation, but it has previously suffered from two problems. First, there has been no systematic way of specifying a prior for the global shrinkage hyperparameter based on prior information about the degree of sparsity in the parameter vector. Second, the horseshoe prior has the undesired property that it offers no way to specify separately the information about sparsity and the amount of regularization for the largest coefficients, which can be problematic for weakly identified parameters, such as logistic regression coefficients in the case of data separation. This paper proposes solutions to both of these problems. We introduce the concept of the effective number of nonzero parameters, show an intuitive way of formulating the prior for the global hyperparameter based on the sparsity assumptions, and argue that the previous default choices are dubious given their tendency to favor solutions with more unshrunk parameters than we typically expect a priori. Moreover, we introduce a generalization of the horseshoe prior, called the regularized horseshoe, which allows us to specify a minimum level of regularization for the largest values. We show that the new prior can be considered the continuous counterpart of the spike-and-slab prior with a finite slab width, whereas the original horseshoe resembles the spike-and-slab with an infinitely wide slab. Numerical experiments on synthetic and real-world data illustrate the benefit of both of these theoretical advances.
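For reference, one common way to write the regularized horseshoe described above is sketched below (the notation for D coefficients beta_j, observation noise scale sigma, and n observations is assumed here, not taken from the listing):

```latex
% Regularized horseshoe (sketch of the standard formulation)
\beta_j \mid \lambda_j, \tau, c \sim \mathrm{N}\!\left(0,\; \tau^{2}\tilde{\lambda}_j^{2}\right),
\qquad
\tilde{\lambda}_j^{2} = \frac{c^{2}\lambda_j^{2}}{c^{2} + \tau^{2}\lambda_j^{2}},
\qquad
\lambda_j \sim \mathrm{C}^{+}(0,1).

% A prior guess p_0 for the number of nonzero coefficients (out of D, with
% noise scale sigma and n observations) suggests a scale for the global
% shrinkage parameter tau of
\tau_0 = \frac{p_0}{D - p_0}\,\frac{\sigma}{\sqrt{n}}.
```

For coefficients close to zero the prior behaves like the original horseshoe, while coefficients far from zero are regularized approximately as a Gaussian with standard deviation c, which is what allows a minimum level of regularization to be specified for the largest values.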
International Workshop on Machine Learning for Signal Processing | 2016
Juho Piironen; Aki Vehtari
We propose a new method for simplifying Gaussian process (GP) models by projecting the information contained in the full encompassing model and selecting a reduced number of variables based on their predictive relevance. Our results on synthetic and real-world datasets show that the proposed method improves the assessment of variable relevance compared to automatic relevance determination (ARD) via the length-scale parameters. We expect the method to be useful for improving the explainability of the models, reducing future measurement costs, and reducing the computation time for making new predictions.
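For context, the ARD baseline referred to above can be sketched as follows (a minimal illustration using scikit-learn, not the projection method proposed in the paper; the toy data and settings are made up): a GP with a separate length-scale per input is fitted, and the inverse length-scales are read as a crude relevance measure.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(1)
n, d = 80, 5
X = rng.normal(size=(n, d))
# Only the first two inputs affect the target in this toy example.
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] + 0.1 * rng.normal(size=n)

kernel = RBF(length_scale=np.ones(d))  # one length-scale per input (ARD)
gp = GaussianProcessRegressor(kernel=kernel, alpha=0.1, normalize_y=True)
gp.fit(X, y)

# Short optimized length-scale -> the input moves the fit a lot -> "relevant"
# under the ARD heuristic.
relevance = 1.0 / gp.kernel_.length_scale
for j, r in enumerate(relevance):
    print(f"x{j}: ARD relevance {r:.2f}")
```

The paper's point is that relevance ranked this way can be misleading, and that projecting the full GP onto reduced variable sets gives a better assessment of predictive relevance.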
International Conference on Artificial Intelligence and Statistics | 2017
Juho Piironen; Aki Vehtari
arXiv: Methodology | 2015
Juho Piironen; Aki Vehtari
arXiv: Methodology | 2016
Eero Siivola; Juho Piironen; Aki Vehtari
International Conference on Telecommunications | 2018
Simon Holmbacka; Jarno Niemela; Henri Karikallio; Karri Sunila; Ilkka Raiskinen; Eero Siivola; Juho Piironen; Tuomas Sivula
International Conference on Artificial Intelligence and Statistics | 2018
Juho Piironen; Aki Vehtari
arXiv: Methodology | 2018
Topi Paananen; Juho Piironen; Michael Riis Andersen; Aki Vehtari
arXiv: Machine Learning | 2018
Juho Piironen; Markus Paasiniemi; Aki Vehtari