
Publication


Featured research published by Dimitris Fouskakis.


Journal of Global Optimization | 2000

A Case Study of Stochastic Optimization in Health Policy: Problem Formulation and Preliminary Results

David Draper; Dimitris Fouskakis

We use Bayesian decision theory to address a variable selection problem arising in attempts to indirectly measure the quality of hospital care, by comparing observed mortality rates to expected values based on patient sickness at admission. Our method weighs data collection costs against predictive accuracy to find an optimal subset of the available admission sickness variables. The approach involves maximizing expected utility across possible subsets, using Monte Carlo methods based on random division of the available data into N modeling and validation splits to approximate the expectation. After exploring the geometry of the solution space, we compare a variety of stochastic optimization methods, including genetic algorithms (GA), simulated annealing (SA), tabu search (TS), threshold acceptance (TA), and messy simulated annealing (MSA), on their performance in finding good subsets of variables, and we clarify the role of N in the optimization. Preliminary results indicate that TS is somewhat better than TA and SA in this problem, with MSA and GA well behind the other three methods. Sensitivity analysis reveals broad stability of our conclusions.
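As a concrete illustration of the optimization step, the sketch below runs simulated annealing over binary inclusion vectors. The `expected_utility` here is a hypothetical toy stand-in (in the paper the utility is a Monte Carlo average, over the N modeling/validation splits, of predictive accuracy minus data-collection cost); it only makes the sketch self-contained and is not the authors' implementation.

```python
import math
import random

COST = 0.3  # hypothetical per-variable data-collection cost

def expected_utility(subset):
    """Toy stand-in for the paper's criterion: the real utility averages
    predictive accuracy minus data-collection cost over N random
    modeling/validation splits. Here the first five variables are
    'informative' and each included variable costs COST."""
    signal = sum(1.0 for j, inc in enumerate(subset) if inc and j < 5)
    return signal - COST * sum(subset)

def simulated_annealing(p, utility, n_iter=5000, t0=1.0, cooling=0.999):
    """Maximize `utility` over the 2^p binary inclusion vectors."""
    current = [random.random() < 0.5 for _ in range(p)]  # random start
    u = utility(current)
    best, best_u = current[:], u
    t = t0
    for _ in range(n_iter):
        cand = current[:]
        cand[random.randrange(p)] ^= True  # flip one variable in or out
        cu = utility(cand)
        # accept improvements always; accept deteriorations with
        # probability exp((cu - u) / t), which shrinks as t cools
        if cu >= u or random.random() < math.exp((cu - u) / t):
            current, u = cand, cu
            if u > best_u:
                best, best_u = current[:], u
        t *= cooling  # geometric cooling schedule
    return best, best_u

subset, value = simulated_annealing(p=20, utility=expected_utility)
```

Threshold acceptance differs only in the acceptance rule (any candidate within a deterministic threshold of the current utility is accepted), which is why the two methods are natural competitors here.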


Journal of the American Statistical Association | 2008

Comparing Stochastic Optimization Methods for Variable Selection in Binary Outcome Prediction, With Application to Health Policy

Dimitris Fouskakis; David Draper

Traditional variable-selection strategies in generalized linear models (GLMs) seek to optimize a measure of predictive accuracy without regard for the cost of data collection. When the purpose of such model building is the creation of predictive scales to be used in future studies with constrained budgets, the standard approach may not be optimal. We propose a Bayesian decision-theoretic framework for variable selection in binary-outcome GLMs where the budget for data collection is constrained and potential predictors may vary considerably in cost. The method is illustrated using data from a large study of quality of hospital care in the U.S. in the 1980s. Especially when the number of available predictors p is large, it is important to use an appropriate technique for optimization (e.g., in an application presented here where p = 83, the space over which we search has 2^83 ≐ 10^25 elements, which is too large to explore using brute-force enumeration). Specifically, we investigate simulated annealing (SA), genetic algorithms (GAs), and the tabu search (TS) method used in operations research, and we develop a context-specific version of SA, improved simulated annealing (ISA), that performs better than its generic counterpart. When p was modest in our study, we found that GAs performed relatively poorly for all but the very best user-defined input configurations, generic SA did not perform well, and TS had excellent median performance and was much less sensitive to suboptimal choice of user-defined inputs. When p was large in our study, the best versions of GA and ISA outperformed TS and generic SA. Our results are presented in the context of health policy but can apply to other quality assessment settings with dichotomous outcomes as well.
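Since exhaustive enumeration of 2^83 ≐ 10^25 subsets is impossible, local search heuristics such as TS explore one single-flip neighborhood at a time. Below is a minimal, generic tabu search sketch over inclusion vectors; `toy_utility` is a hypothetical placeholder so the example runs, and none of this is the authors' code.

```python
import random
from collections import deque

def toy_utility(subset, cost=0.3):
    """Hypothetical criterion: reward the first five 'informative'
    variables, charge `cost` per included variable."""
    return sum(1.0 for j, inc in enumerate(subset) if inc and j < 5) - cost * sum(subset)

def tabu_search(p, utility, n_iter=300, tenure=15):
    """Minimal tabu search over binary inclusion vectors: each step
    evaluates all single-flip neighbors, moves to the best admissible
    one, and forbids re-flipping that index for `tenure` iterations.
    Aspiration rule: a tabu move is admissible if it beats the best
    solution found so far."""
    current = [random.random() < 0.5 for _ in range(p)]
    best, best_u = current[:], utility(current)
    tabu = deque(maxlen=tenure)  # FIFO memory of recently flipped indices
    for _ in range(n_iter):
        candidates = []
        for j in range(p):
            cand = current[:]
            cand[j] ^= True
            u = utility(cand)
            if j not in tabu or u > best_u:  # aspiration criterion
                candidates.append((u, j, cand))
        if not candidates:
            break
        u, j, current = max(candidates)  # best admissible neighbor
        tabu.append(j)
        if u > best_u:
            best, best_u = current[:], u
    return best, best_u

subset, value = tabu_search(p=20, utility=toy_utility)
```

The tabu memory lets the search accept non-improving moves without immediately cycling back, one plausible reason for the robustness to user-defined inputs that the abstract reports.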


Bayesian Analysis | 2015

Power-Expected-Posterior Priors for Variable Selection in Gaussian Linear Models

Dimitris Fouskakis; Ioannis Ntzoufras; David Draper

In the context of the expected-posterior prior (EPP) approach to Bayesian variable selection in linear models, we combine ideas from power-prior and unit-information-prior methodologies to simultaneously produce a minimally-informative prior and diminish the effect of training samples. The result is that in practice our power-expected-posterior (PEP) methodology is sufficiently insensitive to the size n* of the training sample, due to PEP's unit-information construction, that one may take n* equal to the full-data sample size n and dispense with training samples altogether. In this paper we focus on Gaussian linear models and develop our method under two different baseline prior choices: the independence Jeffreys (or reference) prior, yielding the J-PEP posterior, and the Zellner g-prior, leading to Z-PEP. We find that, under the reference baseline prior, the asymptotics of PEP Bayes factors are equivalent to those of Schwarz's BIC criterion, ensuring consistency of the PEP approach to model selection. We compare the performance of our method, in simulation studies and a real example involving prediction of air-pollutant concentrations from meteorological covariates, with that of a variety of previously-defined variants on Bayes factors for objective variable selection. Our prior, due to its unit-information structure, leads to a variable-selection procedure that (1) is systematically more parsimonious than the basic EPP with minimal training sample, while sacrificing no desirable performance characteristics to achieve this parsimony; (2) is robust to the size of the training sample, thus enjoying the advantages described above arising from the avoidance of training samples altogether; and (3) identifies maximum-a-posteriori models that achieve good out-of-sample predictive performance.
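Schematically, and in the notation standard in the EPP literature (this is a paraphrase under assumed notation, not the paper's exact display), the PEP prior for the parameters of model M_l averages a power-posterior, in which the likelihood of imaginary data y* is raised to the power 1/delta, over the reference model's prior predictive:

```latex
% PEP prior: average the baseline power-posterior over imaginary data
% y* drawn from the reference model's prior predictive (schematic).
\[
  \pi^{\mathrm{PEP}}_{\ell}(\theta_{\ell} \mid \delta)
  = \int \pi^{N}_{\ell}(\theta_{\ell} \mid \mathbf{y}^{*}; \delta)\,
         m^{N}_{0}(\mathbf{y}^{*} \mid \delta)\, d\mathbf{y}^{*},
  \qquad
  \pi^{N}_{\ell}(\theta_{\ell} \mid \mathbf{y}^{*}; \delta)
  \propto f_{\ell}(\mathbf{y}^{*} \mid \theta_{\ell})^{1/\delta}\,
          \pi^{N}_{\ell}(\theta_{\ell}).
\]
% Taking delta = n* = n makes the imaginary sample contribute
% information worth roughly one observation, hence "unit-information".
```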


Computational Statistics & Data Analysis | 2009

Importance partitioning in micro-aggregation

George Kokolakis; Dimitris Fouskakis

Micro-aggregation is one of the techniques used by data holders to protect the confidentiality of continuous data. Rather than releasing raw data (individual records), micro-aggregation releases the averages of small groups and thus reduces the risk of identity disclosure. At the same time, the method implies a loss of information and often distorts the data, so the choice of groups is crucial for minimizing information loss and data distortion. No exact polynomial-time algorithm for optimal micro-aggregation is known to date, so heuristic methods are necessary. A heuristic algorithm based on the notion of importance partitioning is proposed, and it is shown to achieve improved performance compared with other micro-aggregation heuristics.
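The mechanics of micro-aggregation are easy to see in one dimension. The sketch below sorts the records, cuts them into consecutive groups of size k, and releases group means; this fixed-size sorting rule is only a simple baseline, not the importance-partitioning heuristic of the paper, and the data are made up.

```python
def microaggregate(values, k=3):
    """Univariate micro-aggregation: sort the records, cut them into
    consecutive groups of size k (the last group absorbs any remainder,
    so every group has at least k members), and release each group's
    mean in place of the raw value."""
    order = sorted(range(len(values)), key=values.__getitem__)
    released = [0.0] * len(values)
    n_groups = max(1, len(order) // k)
    for g in range(n_groups):
        lo = g * k
        hi = (g + 1) * k if g < n_groups - 1 else len(order)
        group = order[lo:hi]
        mean = sum(values[i] for i in group) / len(group)
        for i in group:
            released[i] = mean  # the identity-protecting average
    return released

def information_loss(values, released):
    """Within-group sum of squares: the price paid for confidentiality."""
    return sum((v - r) ** 2 for v, r in zip(values, released))

data = [4.1, 9.8, 2.2, 7.5, 3.3, 8.9, 5.0, 6.4]
agg = microaggregate(data, k=3)
print(agg, information_loss(data, agg))
```

A better grouping rule lowers `information_loss` for the same group size k, which is exactly the dimension along which micro-aggregation heuristics compete.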


Journal of Computational and Graphical Statistics | 2016

Power-Conditional-Expected Priors: Using g-Priors With Random Imaginary Data for Variable Selection

Dimitris Fouskakis; Ioannis Ntzoufras

Zellner's g-prior and its recent hierarchical extensions are the most popular default prior choices in the Bayesian variable selection context. These prior setups can be expressed as power-priors with a fixed set of imaginary data. In this article, we borrow ideas from the power-expected-posterior (PEP) priors to introduce, under the g-prior approach, an extra hierarchical level that accounts for the imaginary data uncertainty. For normal regression variable selection problems, the resulting power-conditional-expected-posterior (PCEP) prior is a conjugate normal-inverse gamma prior that provides a consistent variable selection procedure and gives support to more parsimonious models than the ones supported using the g-prior and the hyper-g prior for finite samples. Detailed illustrations and comparisons of the variable selection procedures using the proposed method, the g-prior, and the hyper-g prior are provided using both simulated and real data examples. Supplementary materials for this article are available online.
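For reference, in the normal linear model y = Xβ + ε with ε ~ N(0, σ²I), Zellner's g-prior takes the standard form below; PCEP's contribution is to treat the imaginary data implicit in this setup as random rather than fixed.

```latex
% Zellner's g-prior (standard form) for the normal linear model
% y = X beta + eps, eps ~ N(0, sigma^2 I):
\[
  \beta \mid \sigma^{2}, g \;\sim\;
  \mathrm{N}\!\left(0,\; g\,\sigma^{2}\,(X^{\top}X)^{-1}\right),
  \qquad
  \pi(\sigma^{2}) \propto \sigma^{-2}.
\]
% Hierarchical extensions (e.g. the hyper-g prior) place a further
% prior on g instead of fixing it.
```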


Statistics and Computing | 2013

Computation for intrinsic variable selection in normal regression models via expected-posterior prior

Dimitris Fouskakis; Ioannis Ntzoufras

In this paper, we focus on the variable selection problem in normal regression models using the expected-posterior prior methodology. We provide a straightforward MCMC scheme for the derivation of the posterior distribution, as well as Monte Carlo estimates for the computation of the marginal likelihood and posterior model probabilities. Additionally, for large model spaces, a model search algorithm based on MC^3 is constructed. The proposed methodology is applied to two real-life examples, already used in the relevant literature of objective variable selection. In both examples, uncertainty over different training samples is taken into consideration.
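For large model spaces, the MC^3 step mentioned in the abstract can be sketched generically: a Metropolis walk on inclusion vectors that proposes flipping one variable and accepts with probability min(1, m(y | proposal)/m(y | current)). The `log_marginal` below is a hypothetical hook (under the expected-posterior prior methodology it would be the Monte Carlo marginal-likelihood estimate described above), and `toy_log_marginal` exists only so the sketch runs.

```python
import math
import random

def toy_log_marginal(gamma):
    """Hypothetical stand-in for log m(y | gamma): favors models that
    include variables 0-2, with a crude BIC-like dimension penalty."""
    hits = sum(1 for j, inc in enumerate(gamma) if inc and j < 3)
    return 2.0 * hits - 0.5 * sum(gamma)

def mc3(p, log_marginal, n_iter=10_000, seed=0):
    """Minimal MC^3 (Markov chain Monte Carlo model composition):
    a Metropolis walk over binary inclusion vectors, flipping one
    indicator per step and accepting with probability
    min(1, m(y | proposal) / m(y | current))."""
    rng = random.Random(seed)
    gamma = [False] * p  # start from the null model
    log_m = log_marginal(gamma)
    visits = {}  # visit counts approximate posterior model probabilities
    for _ in range(n_iter):
        prop = gamma[:]
        prop[rng.randrange(p)] ^= True  # flip one inclusion indicator
        log_m_prop = log_marginal(prop)
        if rng.random() < math.exp(min(0.0, log_m_prop - log_m)):
            gamma, log_m = prop, log_m_prop
        key = tuple(gamma)
        visits[key] = visits.get(key, 0) + 1
    return visits

freqs = mc3(p=10, log_marginal=toy_log_marginal)
top = max(freqs, key=freqs.get)  # most-visited model
```

Because the single-flip proposal is symmetric, the marginal-likelihood ratio alone determines acceptance, assuming equal prior model probabilities.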


Computational Statistics & Data Analysis | 2006

Bregman divergences in the (m×k)-partitioning problem

George Kokolakis; Ph. Nanopoulos; Dimitris Fouskakis

A method of fixed-cardinality partitioning is examined. This methodology can be applied to many problems, such as confidentiality protection, in which the protection of confidential information has to be ensured while preserving the information content of the data. The basic feature of the technique is to aggregate the data into m groups of small fixed size k by minimizing Bregman divergences. It is shown that, in the case of non-uniform probability measures, the groups of the optimal solution are not necessarily separated by hyperplanes, whereas with uniform measures they are. After the creation of an initial partition of a real data set, an algorithm based on two different Bregman divergences is proposed and applied. This methodology provides a very fast and efficient tool for constructing a near-optimum partition for the (m×k)-partitioning problem.
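For readers unfamiliar with the divergence family involved, a Bregman divergence is generated by any strictly convex, differentiable function φ (a standard definition, included here for convenience):

```latex
% Bregman divergence generated by a strictly convex, differentiable phi:
\[
  D_{\varphi}(x, y) \;=\; \varphi(x) - \varphi(y)
      - \langle \nabla\varphi(y),\; x - y \rangle .
\]
% phi(x) = ||x||^2 recovers the squared Euclidean distance, while
% phi(x) = sum_i x_i log x_i (on the probability simplex) yields the
% Kullback-Leibler divergence.
```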


Brazilian Journal of Probability and Statistics | 2016

Limiting behavior of the Jeffreys power-expected-posterior Bayes factor in Gaussian linear models

Dimitris Fouskakis; Ioannis Ntzoufras

Expected-posterior priors (EPPs) have proved to be extremely useful for testing hypotheses about the regression coefficients of normal linear models. One of the advantages of using EPPs is that impropriety of the baseline priors causes no indeterminacy. In regression problems, however, they are based on one or more training samples that can influence the resulting posterior distribution. The power-expected-posterior priors are minimally informative priors that diminish the effect of training samples on the EPP approach by combining ideas from the power-prior and unit-information-prior methodologies. In this paper we show the consistency of the Bayes factors when using the power-expected-posterior priors, with the independence Jeffreys (or reference) prior as a baseline, for normal linear models under very mild conditions on the design matrix.
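Here "consistency" refers to the standard notion of model selection consistency (our gloss, assuming the usual M-closed setting, not the paper's exact statement): if M_T is the data-generating model,

```latex
% Model selection consistency: Bayes factors of every rival model
% against the true model M_T vanish as n grows, so the posterior
% probability of M_T tends to one (finite model space, fixed positive
% prior model probabilities assumed).
\[
  \mathrm{BF}_{\ell T}
  = \frac{m_{\ell}(\mathbf{y})}{m_{T}(\mathbf{y})}
  \;\longrightarrow\; 0
  \quad (n \to \infty) \text{ for all } \ell \neq T
  \;\;\Longrightarrow\;\;
  \Pr(M_{T} \mid \mathbf{y}) \;\longrightarrow\; 1 .
\]
```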


Archive | 2015

Bayesian Variable Selection for Generalized Linear Models Using the Power-Conditional-Expected-Posterior Prior

Konstantinos Perrakis; Dimitris Fouskakis; Ioannis Ntzoufras



Statistics and Computing | 2018

Objective Bayesian transformation and variable selection using default Bayes factors

Efstratia Charitidou; Dimitris Fouskakis; Ioannis Ntzoufras


Collaboration


Dive into Dimitris Fouskakis's collaborations.

Top Co-Authors

Ioannis Ntzoufras
Athens University of Economics and Business

David Draper
University of California

Konstantinos Perrakis
Athens University of Economics and Business

Efstratia Charitidou
National Technical University of Athens

George Kokolakis
National Technical University of Athens

Brunero Liseo
Sapienza University of Rome