Publication


Featured research published by Guillaume Bouchard.


Molecular Biology and Evolution | 2013

Testing for Associations between Loci and Environmental Gradients Using Latent Factor Mixed Models

Eric Frichot; Sean D. Schoville; Guillaume Bouchard; Olivier François

Adaptation to local environments often occurs through natural selection acting on a large number of loci, each having a weak phenotypic effect. One way to detect these loci is to identify genetic polymorphisms that exhibit high correlation with environmental variables used as proxies for ecological pressures. Here, we propose new algorithms based on population genetics, ecological modeling, and statistical learning techniques to screen genomes for signatures of local adaptation. Implemented in the computer program “latent factor mixed model” (LFMM), these algorithms employ an approach in which population structure is introduced using unobserved variables. These fast and computationally efficient algorithms detect correlations between environmental and genetic variation while simultaneously inferring background levels of population structure. Comparing these new algorithms with related methods provides evidence that LFMM can efficiently estimate random effects due to population history and isolation-by-distance patterns when computing gene-environment correlations, and decrease the number of false-positive associations in genome scans. We then apply these models to plant and human genetic data, identifying several genes with functions related to development that exhibit strong correlations with climatic gradients.
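The abstract does not spell out the model, so the following is only a sketch of how a latent factor mixed model of this kind is commonly written. All symbols are assumed notation, not taken from the text above: G is the genotype matrix, X_i the environmental variable for individual i, and U_i, V_ℓ the unobserved factors and loadings that stand in for population structure.

```latex
% Assumed notation (hypothetical, for illustration only):
%   G_{i\ell}     genotype of individual i at locus \ell
%   \mu_\ell      locus-specific mean
%   \beta_\ell    effect of the environmental variable X_i at locus \ell
%   U_i, V_\ell   K latent factors and loadings modeling population structure
%   \varepsilon_{i\ell}  residual noise
G_{i\ell} = \mu_\ell + \beta_\ell^{\top} X_i + U_i^{\top} V_\ell + \varepsilon_{i\ell}
```

Under this reading, a locus is a candidate for local adaptation when its estimated effect \beta_\ell is significantly different from zero after the latent term has absorbed the background structure.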


Genetics | 2014

Fast and Efficient Estimation of Individual Ancestry Coefficients

Eric Frichot; François Mathieu; Théo Trouillon; Guillaume Bouchard; Olivier François

Inference of individual ancestry coefficients, which is important for population genetic and association studies, is commonly performed using computer-intensive likelihood algorithms. With the availability of large population genomic data sets, fast versions of likelihood algorithms have attracted considerable attention. Reducing the computational burden of estimation algorithms remains, however, a major challenge. Here, we present a fast and efficient method for estimating individual ancestry coefficients based on sparse nonnegative matrix factorization algorithms. We implemented our method in the computer program sNMF and applied it to human and plant data sets. The performance of sNMF was then compared with that of the likelihood algorithm implemented in the computer program ADMIXTURE. Without loss of accuracy, sNMF computed estimates of ancestry coefficients with runtimes ∼10–30 times shorter than those of ADMIXTURE.
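To make the idea of ancestry estimation by matrix factorization concrete, here is a minimal Python sketch. It is not the sNMF algorithm: it uses plain multiplicative-update NMF with no sparsity penalty, and the function name `ancestry_nmf` and its parameters are hypothetical choices for illustration. It factors a nonnegative genotype matrix into individual weights and ancestral profiles, then rescales each individual's weights to sum to one.

```python
import numpy as np

def ancestry_nmf(genotypes, k, n_iter=500, seed=0):
    """Illustrative NMF-based ancestry estimate (NOT the sNMF program).

    Factors a nonnegative matrix G (individuals x loci) as Q @ F with
    Q, F >= 0 using standard multiplicative updates, then normalizes the
    rows of Q so they can be read as ancestry proportions.
    """
    rng = np.random.default_rng(seed)
    g = np.asarray(genotypes, dtype=float)
    n, m = g.shape
    q = rng.random((n, k)) + 0.1   # individual-by-cluster weights
    f = rng.random((k, m)) + 0.1   # cluster-by-locus profiles
    eps = 1e-12                    # guards against division by zero
    for _ in range(n_iter):
        q *= (g @ f.T) / (q @ f @ f.T + eps)
        f *= (q.T @ g) / (q.T @ q @ f + eps)
    # After row normalization, q_norm @ f no longer reconstructs g;
    # only the proportions in q_norm are meant to be interpreted.
    q_norm = q / np.maximum(q.sum(axis=1, keepdims=True), eps)
    return q_norm, f

if __name__ == "__main__":
    toy = np.random.default_rng(42).integers(0, 3, size=(8, 30))
    q_hat, _ = ancestry_nmf(toy, k=2)
    print(q_hat.round(3))
```

The row normalization is a simplification; a dedicated method would build the sum-to-one constraint and the sparsity penalty into the optimization itself.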


Government Information Quarterly | 2012

Opinion mining in social media: Modeling, simulating, and forecasting political opinions in the web

Pawel Sobkowicz; Michael Kaschesky; Guillaume Bouchard

Affordable and ubiquitous online communications (social media) provide the means for flows of ideas and opinions and play an increasing role in the transformation and cohesion of society, yet little is understood about how online opinions emerge, diffuse, and gain momentum. To address this problem, an opinion formation framework based on content analysis of social media and sociophysical system modeling is proposed. Based on prior research and our own projects, three building blocks of online opinion tracking and simulation are described: (1) automated topic, emotion, and opinion detection in real time, (2) information flow modeling and agent-based simulation, and (3) modeling of opinion networks, including special social and psychological circumstances, such as the influence of emotions, media, and leaders, and changing social networks. Finally, three application scenarios are presented to illustrate the framework and motivate further research.


Web Search and Data Mining | 2013

Connecting comments and tags: improved modeling of social tagging systems

Dawei Yin; Shengbo Guo; Boris Chidlovskii; Brian D. Davison; Cédric Archambeau; Guillaume Bouchard

Collaborative tagging systems are now deployed extensively to help users share and organize resources. Tag prediction and recommendation can simplify and streamline the user experience, and by modeling user preferences, predictive accuracy can be significantly improved. However, previous methods typically model user behavior based only on a log of prior tags, neglecting other behaviors and information in social tagging systems, e.g., commenting on items and connecting with other users. On the other hand, little is known about the connection and correlations among these behaviors and contexts in social tagging systems. In this paper, we investigate improved modeling for predictive social tagging systems. Our explanatory analyses demonstrate three significant challenges: coupled high order interaction, data sparsity, and cold start on items. We tackle these problems by using a generalized latent factor model and fully Bayesian treatment. To evaluate performance, we test on two real-world data sets from Flickr and Bibsonomy. Our experiments on these data sets show that to achieve the best predictive performance, it is necessary to employ a fully Bayesian treatment in modeling high order relations in social tagging systems. Our methods noticeably outperform state-of-the-art approaches.


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2006

Selection of generative models in classification

Guillaume Bouchard; Gilles Celeux

This paper is concerned with the selection of a generative model for supervised classification. Classical criteria for model selection assess the fit of a model rather than its ability to produce a low classification error rate. A new criterion, the Bayesian entropy criterion (BEC), is proposed. This criterion takes into account the decisional purpose of a model by minimizing the integrated classification entropy. It provides an interesting alternative to the cross-validated error rate, which is computationally expensive. The asymptotic behavior of BEC is presented. Numerical experiments on both simulated and real data sets show that BEC performs better than BIC at selecting a model that minimizes the classification error rate, and that its performance is comparable to that of the cross-validated error rate.


International Conference on Machine Learning and Applications | 2007

Bias-variance tradeoff in hybrid generative-discriminative models

Guillaume Bouchard

Given any generative classifier based on an inexact density model, we can define a discriminative counterpart that reduces its asymptotic error rate, while increasing the estimation variance. An optimal bias-variance balance might be found using hybrid generative-discriminative (HGD) approaches. In this paper, these methods are defined in a unified framework. This allows us to find sufficient conditions under which an improvement in generalization performance is guaranteed. Numerical experiments illustrate the soundness of our statements.


International Conference on Machine Learning | 2009

Split variational inference

Guillaume Bouchard; Onno Zoeter

We propose a deterministic method to evaluate the integral of a positive function based on soft-binning functions that smoothly cut the integral into smaller integrals that are easier to approximate. In combination with mean-field approximations for each individual sub-part this leads to a tractable algorithm that alternates between the optimization of the bins and the approximation of the local integrals. We introduce suitable choices for the binning functions such that a standard mean field approximation can be extended to a split mean field approximation without the need for extra derivations. The method can be seen as a revival of the ideas underlying the mixture mean field approach. The latter can be obtained as a special case by taking soft-max functions for the binning.
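The splitting step described above can be illustrated numerically. The sketch below is an assumption-laden toy, not the authors' algorithm: it builds soft-max binning functions that sum to one at every point, multiplies them into a positive one-dimensional integrand, and shows that the per-bin integrals add up to the full integral. The mean-field approximation of each sub-integral and the optimization of the bins are not implemented; the names `softmax_bins`, `split_integral`, and `trapezoid` are hypothetical.

```python
import numpy as np

def trapezoid(y, x):
    """Trapezoid-rule integral of samples y on grid x."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def softmax_bins(x, weights, biases):
    """Soft bins s_k(x) = softmax_k(w_k * x + b_k); they sum to 1 at each x."""
    z = np.outer(x, weights) + biases        # shape (len(x), K)
    z -= z.max(axis=1, keepdims=True)        # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def split_integral(f, x, weights, biases):
    """Integrals of s_k(x) * f(x) over the grid x, one value per bin."""
    s = softmax_bins(x, weights, biases)
    fx = f(x)
    return np.array([trapezoid(s[:, k] * fx, x) for k in range(s.shape[1])])

if __name__ == "__main__":
    grid = np.linspace(-8.0, 8.0, 2001)
    integrand = lambda t: np.exp(-0.5 * t**2) * (1.0 + 0.5 * np.sin(t))
    parts = split_integral(integrand, grid,
                           np.array([-1.0, 0.0, 1.0]), np.zeros(3))
    print(parts, parts.sum(), trapezoid(integrand(grid), grid))
```

Because the bins sum to one pointwise, the split is exact; any approximation error in a full method would come only from how each smaller integral is approximated.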


Frontiers in Genetics | 2012

Correcting Principal Component Maps for Effects of Spatial Autocorrelation in Population Genetic Data

Eric Frichot; Sean D. Schoville; Guillaume Bouchard; Olivier François

In many species, spatial genetic variation displays patterns of “isolation-by-distance.” Characterized by locally correlated allele frequencies, these patterns are known to create periodic shapes in geographic maps of principal components which confound signatures of specific migration events and influence interpretations of principal component analyses (PCA). In this study, we introduced models combining probabilistic PCA and kriging models to infer population genetic structure from genetic data while correcting for effects generated by spatial autocorrelation. The corresponding algorithms are based on singular value decomposition and low rank approximation of the genotypic data. As their complexity is close to that of PCA, these algorithms scale with the dimensions of the data. To illustrate the utility of these new models, we simulated isolation-by-distance patterns and broad-scale geographic variation using spatial coalescent models. Our methods remove the horseshoe patterns usually observed in PC maps and simplify interpretations of spatial genetic variation. We demonstrate our approach by analyzing single nucleotide polymorphism data from the Human Genome Diversity Panel, and provide comparisons with other recently introduced methods.
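As a point of reference for the singular value decomposition and low-rank approximation mentioned above, the following Python sketch computes ordinary principal component scores from a genotype matrix. It covers only plain PCA; the probabilistic PCA and kriging correction for spatial autocorrelation described in the abstract are not implemented, and the function name `pca_scores` is a hypothetical choice.

```python
import numpy as np

def pca_scores(genotypes, n_components):
    """Plain PCA via SVD (no spatial correction).

    Returns the principal component scores of each individual and the
    rank-n_components approximation of the column-centered matrix.
    """
    g = np.asarray(genotypes, dtype=float)
    centered = g - g.mean(axis=0)            # center each locus
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    low_rank = scores @ vt[:n_components]
    return scores, low_rank

if __name__ == "__main__":
    toy = np.random.default_rng(0).integers(0, 3, size=(10, 50))
    pcs, approx = pca_scores(toy, n_components=2)
    print(pcs.shape, approx.shape)
```

Plotting the first two columns of `pcs` against sampling coordinates is the kind of PC map in which the horseshoe patterns discussed above typically appear.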


International Conference on Digital Government Research | 2011

Opinion mining in social media: modeling, simulating, and visualizing political opinion formation in the web

Michael Kaschesky; Pawel Sobkowicz; Guillaume Bouchard

Affordable and ubiquitous online communications (social media) provide the means for flows of ideas and opinions and play an increasing role in the transformation and cohesion of society, yet little is understood about how online opinions emerge, diffuse, and gain momentum. To address this problem, an opinion formation framework based on content analysis of social media and sociophysical system modeling is proposed. Based on prior research and our own projects, three building blocks of online opinion tracking and simulation are described: (1) automated topic and opinion detection in real time, (2) topic and opinion modeling and agent-based simulation, and (3) visualizations of topic and opinion networks. Finally, two application scenarios are presented to illustrate the framework and motivate further research.


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2015

Latent IBP Compound Dirichlet Allocation

Cédric Archambeau; Balaji Lakshminarayanan; Guillaume Bouchard

We introduce the four-parameter IBP compound Dirichlet process (ICDP), a stochastic process that generates sparse non-negative vectors with a potentially unbounded number of entries. If we repeatedly sample from the ICDP we can generate sparse matrices with an infinite number of columns and power-law characteristics. We apply the four-parameter ICDP to sparse nonparametric topic modelling to account for the very large number of topics present in large text corpora and the power-law distribution of the vocabulary of natural languages. The model, which we call latent IBP compound Dirichlet allocation (LIDA), allows for power-law distributions both in the number of topics summarising the documents and in the number of words defining each topic. It can be interpreted as a sparse variant of the hierarchical Pitman-Yor process when applied to topic modelling. We derive an efficient and simple collapsed Gibbs sampler closely related to the collapsed Gibbs sampler of latent Dirichlet allocation (LDA), making the model applicable in a wide range of domains. Our nonparametric Bayesian topic model compares favourably to the widely used hierarchical Dirichlet process and its heavy-tailed version, the hierarchical Pitman-Yor process, on benchmark corpora. Experiments demonstrate that accounting for the power-law distribution of real data is beneficial and that sparsity provides more interpretable results.

Collaboration


Top Co-Authors

Eric Frichot

Centre national de la recherche scientifique
