Publication


Featured research published by Simon Boitard.


PLOS Biology | 2012

Genome-Wide Analysis of the World's Sheep Breeds Reveals High Levels of Historic Mixture and Strong Recent Selection

James W. Kijas; Johannes A. Lenstra; Ben J. Hayes; Simon Boitard; Laercio R. Porto Neto; Magali San Cristobal; Bertrand Servin; Russell McCulloch; Vicki Whan; Kimberly Gietzen; Samuel Rezende Paiva; W. Barendse; E. Ciani; Herman W. Raadsma; J. C. McEwan; Brian P. Dalrymple

Genomic structure in a global collection of domesticated sheep reveals a history of artificial selection for horn loss and traits relating to pigmentation, reproduction, and body size.


BMC Bioinformatics | 2011

Sparse PLS discriminant analysis: biologically relevant feature selection and graphical displays for multiclass problems

Kim-Anh Lê Cao; Simon Boitard; Philippe Besse

Background: Variable selection on high throughput biological data, such as gene expression or single nucleotide polymorphisms (SNPs), becomes inevitable to select relevant information and, therefore, to better characterize diseases or assess genetic structure. There are different ways to perform variable selection in large data sets. Statistical tests are commonly used to identify differentially expressed features for explanatory purposes, whereas Machine Learning wrapper approaches can be used for predictive purposes. In the case of multiple highly correlated variables, another option is to use multivariate exploratory approaches to give more insight into cell biology, biological pathways or complex traits.

Results: A simple extension of a sparse PLS exploratory approach is proposed to perform variable selection in a multiclass classification framework.

Conclusions: sPLS-DA has a classification performance similar to other wrapper or sparse discriminant analysis approaches on public microarray and SNP data sets. More importantly, sPLS-DA is clearly competitive in terms of computational efficiency and superior in terms of interpretability of the results via valuable graphical outputs. sPLS-DA is available in the R package mixOmics, which is dedicated to the analysis of large biological data sets.


Genetics | 2010

Detecting Selection in Population Trees: The Lewontin and Krakauer Test Extended

Maxime Bonhomme; Claude Chevalet; Bertrand Servin; Simon Boitard; Jihad Abdallah; Sarah Blott; Magali SanCristobal

Detecting genetic signatures of selection is of great interest for many research issues. Common approaches to separate selective from neutral processes focus on the variance of FST across loci, as does the original Lewontin and Krakauer (LK) test. Modern developments aim to minimize the false positive rate and to increase the power, by accounting for complex demographic structures. Another stimulating goal is to develop straightforward parametric and computationally tractable tests to deal with massive SNP data sets. Here, we propose an extension of the original LK statistic (\(T_{LK}\)), named \(T_{F\text{-}LK}\), that uses a phylogenetic estimation of the population kinship (\(\mathcal{F}\)) matrix, thus accounting for historical branching and heterogeneity of genetic drift. Using forward simulations of single-nucleotide polymorphism (SNP) data under neutrality and selection, we confirm the relative robustness of the LK statistic (\(T_{LK}\)) to complex demographic history but we show that \(T_{F\text{-}LK}\) is more powerful in most cases. This new statistic also outperforms a multinomial-Dirichlet-based model [estimation with Markov chain Monte Carlo (MCMC)], when historical branching occurs. Overall, \(T_{F\text{-}LK}\) detects 15–35% more selected SNPs than \(T_{LK}\) for low type I errors (P < 0.001). Also, simulations show that \(T_{LK}\) and \(T_{F\text{-}LK}\) follow a chi-square distribution provided the ancestral allele frequencies are not too extreme, suggesting the possible use of the chi-square distribution for evaluating significance. The empirical distribution of \(T_{F\text{-}LK}\) can be derived using simulations conditioned on the estimated \(\mathcal{F}\) matrix. We apply this new test to pig breeds SNP data and pinpoint outliers using \(T_{F\text{-}LK}\), otherwise undetected using the less powerful \(T_{LK}\) statistic. This new test represents one solution for compromise between advanced SNP genetic data acquisition and outlier analyses.
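
As an illustration of the original LK statistic mentioned in this abstract, the following is a minimal sketch, not the authors' implementation. It assumes the classical form in which a per-locus FST is the variance of population allele frequencies divided by p(1 - p), with p the mean frequency, and the statistic is (n - 1) FST divided by the mean FST across loci, n being the number of populations. The function names `fst` and `lk_statistics` are hypothetical, and the kinship-matrix extension described in the abstract is not covered.

```python
from statistics import mean, variance


def fst(pop_freqs):
    """Per-locus F_ST from allele frequencies in n populations (assumed form).

    Uses the sample variance (n - 1 denominator) of the frequencies
    divided by p_bar * (1 - p_bar).
    """
    p_bar = mean(pop_freqs)
    if p_bar <= 0.0 or p_bar >= 1.0:
        raise ValueError("locus is monomorphic across populations")
    return variance(pop_freqs) / (p_bar * (1.0 - p_bar))


def lk_statistics(loci):
    """Return one LK statistic per locus.

    `loci` is a list of loci, each a list of allele frequencies
    (one per population, same population order at every locus).
    """
    n_pops = len(loci[0])
    fst_values = [fst(freqs) for freqs in loci]
    fst_bar = mean(fst_values)  # genome-wide average F_ST
    return [(n_pops - 1) * f / fst_bar for f in fst_values]


# Toy data: 3 loci observed in 3 populations (made-up frequencies).
example_loci = [
    [0.1, 0.2, 0.3],
    [0.5, 0.5, 0.9],
    [0.4, 0.6, 0.5],
]
example_t = lk_statistics(example_loci)
```

Under neutrality, each value would be compared to a chi-square distribution with n - 1 degrees of freedom, as the abstract suggests; large values flag candidate loci.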


Genetics | 2013

Detecting Signatures of Selection Through Haplotype Differentiation Among Hierarchically Structured Populations

María Inés Fariello; Simon Boitard; Hugo Naya; Magali SanCristobal; Bertrand Servin

The detection of molecular signatures of selection is one of the major concerns of modern population genetics. A widely used strategy in this context is to compare samples from several populations and to look for genomic regions with outstanding genetic differentiation between these populations. Genetic differentiation is generally based on allele frequency differences between populations, which are measured by FST or related statistics. Here we introduce a new statistic, denoted hapFLK, which focuses instead on the differences of haplotype frequencies between populations. In contrast to most existing statistics, hapFLK accounts for the hierarchical structure of the sampled populations. Using computer simulations, we show that each of these two features—the use of haplotype information and of the hierarchical structure of populations—significantly improves the detection power of selected loci and that combining them in the hapFLK statistic provides even greater power. We also show that hapFLK is robust with respect to bottlenecks and migration and improves over existing approaches in many situations. Finally, we apply hapFLK to a set of six sheep breeds from Northern Europe and identify seven regions under selection, which include already reported regions but also several new ones. We propose a method to help identifying the population(s) under selection in a detected region, which reveals that in many of these regions selection most likely occurred in more than one population. Furthermore, several of the detected regions correspond to incomplete sweeps, where the favorable haplotype is only at intermediate frequency in the population(s) under selection.


Molecular Biology and Evolution | 2012

Detecting Selective Sweeps from Pooled Next-Generation Sequencing Samples

Simon Boitard; Christian Schlötterer; Viola Nolte; Ram Vinay Pandey; Andreas Futschik

Due to its cost effectiveness, next-generation sequencing of pools of individuals (Pool-Seq) is becoming a popular strategy for characterizing variation in population samples. Because Pool-Seq provides genome-wide SNP frequency data, it is possible to use them for demographic inference and/or the identification of selective sweeps. Here, we introduce a statistical method that is designed to detect selective sweeps from pooled data by accounting for statistical challenges associated with Pool-Seq, namely sequencing errors and random sampling among chromosomes. This allows for an efficient use of the information: all base calls are included in the analysis, but the higher credibility of regions with higher coverage and base calls with better quality scores is accounted for. Computer simulations show that our method efficiently detects sweeps even at very low coverage (0.5× per chromosome). Indeed, the power of detecting sweeps is similar to what we could expect from sequences of individual chromosomes. Since the inference of selective sweeps is based on the allele frequency spectrum (AFS), we also provide a method to accurately estimate the AFS provided that the quality scores for the sequence reads are reliable. Applying our approach to Pool-Seq data from Drosophila melanogaster, we identify several selective sweep signatures on chromosome X that include some previously well-characterized sweeps like the wapl region.
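
The two sources of noise named in this abstract, sequencing errors and random sampling among chromosomes, can be illustrated with a deliberately simplified sketch. It is not the method of the paper: it assumes a biallelic site, a single symmetric error rate `err` for all reads, reads drawn independently from the pool, and a uniform prior on the number of derived alleles. The function names are hypothetical.

```python
from math import comb


def read_likelihood(d, c, k, n, err):
    """Probability of seeing d derived reads among c reads.

    Assumes k of the n pooled chromosomes carry the derived allele and
    that each read is miscalled with probability err (simplified model).
    """
    f = k / n                          # true derived frequency in the pool
    q = f * (1 - err) + (1 - f) * err  # chance a single read shows 'derived'
    return comb(c, d) * q**d * (1 - q) ** (c - d)


def posterior_allele_count(d, c, n, err):
    """Posterior over k = 0..n under a uniform prior (an assumption)."""
    lik = [read_likelihood(d, c, k, n, err) for k in range(n + 1)]
    total = sum(lik)
    return [value / total for value in lik]


# Toy site: 3 derived reads out of 10, pool of 20 chromosomes, 1% error.
example_post = posterior_allele_count(3, 10, 20, 0.01)
```

Higher coverage `c` concentrates the posterior, which mirrors the abstract's point that regions with higher coverage carry more credibility.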


Molecular Ecology Resources | 2013

Pool-hmm: a Python program for estimating the allele frequency spectrum and detecting selective sweeps from next generation sequencing of pooled samples

Simon Boitard; Robert Kofler; Pierre Françoise; David Robelin; Christian Schlötterer; Andreas Futschik

Due to its cost effectiveness, next generation sequencing of pools of individuals (Pool‐Seq) is becoming a popular strategy for genome‐wide estimation of allele frequencies in population samples. As the allele frequency spectrum provides information about past episodes of selection, Pool‐Seq is also a promising design for genomic scans for selection. However, no software tool has yet been developed for selection scans based on Pool‐Seq data. We introduce Pool‐hmm, a Python program for the estimation of allele frequencies and the detection of selective sweeps in a Pool‐Seq sample. Pool‐hmm includes several options that allow a flexible analysis of Pool‐Seq data, and can be run in parallel on several processors. Source code and documentation for Pool‐hmm are freely available at https://qgsp.jouy.inra.fr/.


PLOS ONE | 2011

Epilepsy caused by an abnormal alternative splicing with dosage effect of the SV2A gene in a chicken model

Marine Douaud; Katia Feve; Fabienne Pituello; David Gourichon; Simon Boitard; Eric LeGuern; Gérard Coquerelle; Agathe Vieaud; Cesira Batini; R. Naquet; Alain Vignal; Michèle Tixier-Boichard; Frédérique Pitel

Photosensitive reflex epilepsy is caused by the combination of an individual's enhanced sensitivity with relevant light stimuli, such as stroboscopic lights or video games. This is the most common reflex epilepsy in humans; it is characterized by the photoparoxysmal response, which is an abnormal electroencephalographic reaction, and seizures triggered by intermittent light stimulation. Here, by using genetic mapping, sequencing and functional analyses, we report that a mutation in the acceptor site of the second intron of SV2A (the gene encoding synaptic vesicle glycoprotein 2A) is causing photosensitive reflex epilepsy in a unique vertebrate model, the Fepi chicken strain, a spontaneous model where the neurological disorder is inherited as an autosomal recessive mutation. This mutation causes an aberrant splicing event and significantly reduces the level of SV2A mRNA in homozygous carriers. Levetiracetam, a second generation antiepileptic drug, is known to bind SV2A, and SV2A knock-out mice develop seizures soon after birth and usually die within three weeks. The Fepi chicken survives to adulthood and responds to levetiracetam, suggesting that the low-level expression of SV2A in these animals is sufficient to allow survival, but does not protect against seizures. Thus, the Fepi chicken model shows that the role of the SV2A pathway in the brain is conserved between birds and mammals, in spite of a large phylogenetic distance. The Fepi model appears particularly useful for further studies of physiopathology of reflex epilepsy, in comparison with induced models of epilepsy in rodents. Consequently, SV2A is a very attractive candidate gene for analysis in the context of both mono- and polygenic generalized epilepsies in humans.


Animal Genetics | 2010

Genetic variability, structure and assignment of Spanish and French pig populations based on a large sampling

Simon Boitard; Claude Chevalet; M.-J. Mercat; J. C. Meriaux; Armand Sánchez; J. Tibau; Magali SanCristobal

The Spanish and French pig populations share the common practice of quasi systematic paternity control of pure breed and composite line males. Ten microsatellite markers are in common between Spain and France controls, among the 17 markers used in France and the 13 used in Spain. After the adjustment of allele sizes, it is possible to merge the two datasets and to obtain a set of 5791 animals, including the vast majority of the males in the Duroc, Landrace, Large White and Piétrain French and Spanish breeds. Twelve French composite lines are also available. The genetic diversity analysis of these pig populations is presented, as well as the assignment of an individual to its breed. The effects of heterogeneous sampling across time and of relatedness among animals are also assessed. Consistent with the results of the previous studies, we found that different populations from the same breed clearly clustered together. In addition, all populations of this study, whether purebred or composite, are quite well differentiated from the other ones. As a result, we note that the 10 microsatellites commonly used for paternity control ensure a powerful detection of the breed of origin, with the power of detection being 95-99%. The detection of the exact population within breed is more difficult, but the power exceeds 70% for most of the populations. Practical implications include, for instance, the detection of outlier animals, crosses and admixture events.


Genetics | 2016

Uncovering Adaptation from Sequence Data: Lessons from Genome Resequencing of Four Cattle Breeds

Simon Boitard; Mekki Boussaha; Aurélien Capitan; Dominique Rocha; Bertrand Servin

Detecting the molecular basis of adaptation is one of the major questions in population genetics. With the advance in sequencing technologies, nearly complete interrogation of genome-wide polymorphisms in multiple populations is becoming feasible in some species, with the expectation that it will extend quickly to new ones. Here, we investigate the advantages of sequencing for the detection of adaptive loci in multiple populations, exploiting a recently published data set in cattle (Bos taurus). We used two different approaches to detect statistically significant signals of positive selection: a within-population approach aimed at identifying hard selective sweeps and a population-differentiation approach that can capture other selection events such as soft or incomplete sweeps. We show that the two methods are complementary in that they indeed capture different kinds of selection signatures. Our study confirmed some of the well-known adaptive loci in cattle (e.g., MC1R, KIT, GHR, PLAG1, NCAPG/LCORL) and detected some new ones (e.g., ARL15, PRLR, CYP19A1, PPM1L). Compared to genome scans based on medium- or high-density SNP data, we found that sequencing offered an increased detection power and a higher resolution in the localization of selection signatures. In several cases, we could even pinpoint the underlying causal adaptive mutation or at least a very small number of possible candidates (e.g., MC1R, PLAG1). Our results on these candidates suggest that a vast majority of adaptive mutations are likely to be regulatory rather than protein-coding variants.


Nature Genetics | 2018

Meta-analysis of genome-wide association studies for cattle stature identifies common genes that regulate body size in mammals

Aniek C. Bouwman; Hans D. Daetwyler; Amanda J. Chamberlain; Carla Hurtado Ponce; Mehdi Sargolzaei; F.S. Schenkel; Goutam Sahana; Armelle Govignon-Gion; Simon Boitard; M. Dolezal; Hubert Pausch; Rasmus Froberg Brøndum; Phil J. Bowman; Bo Thomsen; Bernt Guldbrandtsen; Mogens Sandø Lund; Bertrand Servin; Dorian J. Garrick; James M. Reecy; Johanna Vilkki; A. Bagnato; Min Wang; Jesse L. Hoff; Robert D. Schnabel; Jeremy F. Taylor; Anna A. E. Vinkhuyzen; Frank Panitz; Christian Bendixen; Lars-Erik Holm; Birgit Gredler

Stature is affected by many polymorphisms of small effect in humans [1]. In contrast, variation in dogs, even within breeds, has been suggested to be largely due to variants in a small number of genes [2,3]. Here we use data from cattle to compare the genetic architecture of stature to those in humans and dogs. We conducted a meta-analysis for stature using 58,265 cattle from 17 populations with 25.4 million imputed whole-genome sequence variants. Results showed that the genetic architecture of stature in cattle is similar to that in humans, as the lead variants in 163 significantly associated genomic regions (\(P < 5 \times 10^{-8}\)) explained at most 13.8% of the phenotypic variance. Most of these variants were noncoding, including variants that were also expression quantitative trait loci (eQTLs) and in ChIP–seq peaks. There was significant overlap in loci for stature with humans and dogs, suggesting that a set of common genes regulates body size in mammals.

Meta-analysis of data from 58,265 cattle shows that the genetic architecture underlying stature is similar to that in humans, where many genomic regions individually explain only a small amount of phenotypic variance.

Collaboration


Dive into Simon Boitard's collaborations.

Top Co-Authors

Bertrand Servin
Institut national de la recherche agronomique

Magali SanCristobal
Institut national de la recherche agronomique

Brigitte Mangin
Institut national de la recherche agronomique

David Gourichon
Institut national de la recherche agronomique

David Robelin
Institut national de la recherche agronomique

Frédérique Pitel
Institut national de la recherche agronomique

Jihad Abdallah
North Carolina State University

Christine Cierco-Ayrolles
Institut national de la recherche agronomique