Guillaume Achaz
Collège de France
Publication
Featured research published by Guillaume Achaz.
Molecular Ecology | 2012
N. Puillandre; Amaury Lambert; S. Brouillet; Guillaume Achaz
Within uncharacterized groups, DNA barcodes, short DNA sequences that are present in a wide range of species, can be used to assign organisms into species. We propose an automatic procedure that sorts the sequences into hypothetical species based on the barcode gap, which can be observed whenever the divergence among organisms belonging to the same species is smaller than divergence among organisms from different species. We use a range of prior intraspecific divergence to infer from the data a model-based one-sided confidence limit for intraspecific divergence. The method, called Automatic Barcode Gap Discovery (ABGD), then detects the barcode gap as the first significant gap beyond this limit and uses it to partition the data. Inference of the limit and gap detection are then recursively applied to previously obtained groups to get finer partitions until there is no further partitioning. Using six published data sets of metazoans, we show that ABGD is computationally efficient and performs well for standard prior maximum intraspecific divergences (a few per cent of divergence for the five data sets), except for one data set where less than three sequences per species were sampled. We further explore the theoretical limitations of ABGD through simulation of explicit speciation and population genetics scenarios. Our results emphasize in particular the sensitivity of the method to the presence of recent speciation events, via (unrealistically) high rates of speciation or large numbers of species. In conclusion, ABGD is a fast, simple method to split a sequence alignment data set into candidate species that should be complemented with other evidence in an integrative taxonomic approach.
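The gap-then-partition idea can be illustrated with a small sketch. It is not the ABGD implementation: the model-based confidence limit is replaced by a user-supplied prior (`prior_max_intra`), the "first significant gap" is approximated by the widest gap above that prior, and the recursive refinement step is omitted. The function names and the toy distances are hypothetical.

```python
def find_barcode_gap(distances, prior_max_intra):
    """Return a distance threshold in the middle of the widest gap between
    consecutive sorted pairwise distances, considering only gaps whose upper
    edge lies above the prior maximum intraspecific divergence.
    Returns None if no such gap exists. (Simplified stand-in for ABGD's
    significance-based gap detection.)"""
    ordered = sorted(distances)
    best_width, threshold = 0.0, None
    for lo, hi in zip(ordered, ordered[1:]):
        if hi > prior_max_intra and hi - lo > best_width:
            best_width, threshold = hi - lo, (lo + hi) / 2.0
    return threshold


def partition(n, dist, threshold):
    """Group n sequences by single linkage: two sequences share a group
    whenever a chain of pairwise distances below the threshold joins them.
    `dist` maps index pairs (i, j) to a distance."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for (i, j), d in dist.items():
        if d < threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Toy data: sequences 0-2 are close to each other, as are 3-4.
toy_dist = {
    (0, 1): 0.01, (0, 2): 0.02, (1, 2): 0.015, (3, 4): 0.01,
    (0, 3): 0.10, (0, 4): 0.11, (1, 3): 0.12,
    (1, 4): 0.10, (2, 3): 0.11, (2, 4): 0.12,
}
gap = find_barcode_gap(toy_dist.values(), prior_max_intra=0.03)
candidate_species = partition(5, toy_dist, gap)
```

In this toy case the intraspecific and interspecific distances do not overlap, so a single threshold separates them cleanly. The abstract notes that recent speciation events erode exactly this separation.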
Nature Genetics | 2013
Saeko Ishida; Fabienne Picard; Gabrielle Rudolf; Eric Noé; Guillaume Achaz; Pierre Thomas; Pierre Genton; Emeline Mundwiller; Markus Wolff; Christian Marescaux; Richard B. Miles; Michel Baulac; Edouard Hirsch; Eric LeGuern; Stéphanie Baulac
The main familial focal epilepsies are autosomal dominant nocturnal frontal lobe epilepsy, familial temporal lobe epilepsy and familial focal epilepsy with variable foci. A frameshift mutation in the DEPDC5 gene (encoding DEP domain–containing protein 5) was identified in a family with focal epilepsy with variable foci by linkage analysis and exome sequencing. Subsequent pyrosequencing of DEPDC5 in a cohort of 15 additional families with focal epilepsies identified 4 nonsense mutations and 1 missense mutation. Our findings provided evidence of frequent (37%) loss-of-function mutations in DEPDC5 associated with a broad spectrum of focal epilepsies. The implication of a DEP (Dishevelled, Egl-10 and Pleckstrin) domain–containing protein that may be involved in membrane trafficking and/or G protein signaling opens new avenues for research.
Genetics | 2009
Guillaume Achaz
Neutrality tests based on the frequency spectrum (e.g., Tajima's D or Fu and Li's F) are commonly used by population geneticists as routine tests to assess the goodness-of-fit of the standard neutral model on their data sets. Here, I show that these neutrality tests are specific instances of a general model that encompasses them all. I illustrate how this general framework can be taken advantage of to devise new more powerful tests that better detect deviations from the standard model. Finally, I exemplify the usefulness of the framework on SNP data by showing how it supports the selection hypothesis in the lactase human gene by overcoming the ascertainment bias. The framework presented here paves the way for constructing novel tests optimized for specific violations of the standard model that ultimately will help to unravel scenarios of evolution.
Nucleic Acids Research | 2002
Guillaume Achaz; Eduardo P. C. Rocha; Pierre Netter; Eric Coissac
We investigated 53 complete bacterial chromosomes for intrachromosomal repeats. In previous studies on eukaryote chromosomes, we proposed a model for the dynamics of repeats based on the continuous genesis of tandem repeats, followed by an active process of high deletion rate, counteracted by rearrangement events that may prevent the repeats from being deleted. The present study of long repeats in the genomes of Bacteria and Archaea suggests that our model of interspersed repeats dynamics may apply to them. Thus the duplication process might be a consequence of very ancient mechanisms shared by all three domains. Moreover, we show that there is a strong negative correlation between nucleotide composition bias and the repeat density of genomes. We hypothesise that in highly biased genomes, non-duplicated small repeats arise more frequently by random effects and are used as primers for duplication mechanisms, leading to a higher density of large repeats.
Applied and Environmental Microbiology | 2008
Pierre Nicolas; Stanislas Mondot; Guillaume Achaz; Catherine Bouchenot; Jean-François Bernardet; Eric Duchaud
Flavobacterium psychrophilum is currently one of the main bacterial pathogens hampering the productivity of salmonid farming worldwide, and its control mainly relies on antibiotic treatments. To better understand the population structure of this bacterium and its mode of evolution, we have examined the nucleotide polymorphisms at 11 protein-coding loci of the core genome in a set of 50 isolates. These isolates were selected to represent the broadest possible diversity, originating from 10 different host fish species and four continents. The nucleotide diversity between pairs of sequences amounted to fewer than four differences per kilobase on average, corresponding to a particularly low level of diversity, possibly indicative of a small effective-population size. The recombination rate, however, seemed remarkably high, and as a consequence, most of the isolates harbored unique combinations of alleles (33 distinct sequence types were resolved). The analysis also showed the existence of several clonal complexes with worldwide geographic distribution but marked association with particular fish species. Such an association could reflect preferential routes of transmission and/or adaptive niche specialization. The analysis provided no clues that the initial range of the bacterium was originally limited to North America. Instead, the historical record of the expansion of the pathogen may reflect the spread of a few clonal complexes. As a resource for future epidemiological surveys, a multilocus sequence typing website based on seven highly informative loci is available.
Bioinformatics | 2007
Guillaume Achaz; Frédéric Boyer; Eduardo P. C. Rocha; Alain Viari; Eric Coissac
Chromosomes or other long DNA sequences contain many highly similar repeated sub-sequences. While there are efficient methods for detecting strict repeats or detecting already characterized repeats, there is no software available for detecting approximate repeats in large DNA sequences allowing for weighted substitutions and indels in a coherent statistical framework. Here, we present an implementation of a two-step method (seed detection followed by seed extension) that detects those approximate repeats. Our method is computationally efficient enough to handle large sequences and is flexible enough to account for influencing factors, such as sequence-composition biases both at the seed detection and alignment levels. Availability: http://wwwabi.snv.jussieu.fr/public/RepSeek/
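The seed-then-extend strategy can be sketched in a few lines. This is a deliberately reduced illustration, not the RepSeek algorithm. Seeds are exact repeated k-mers. Extension is ungapped, with a +1/-1 score and an X-drop stopping rule, so indels, weighted substitutions, inverted repeats, overlap between copies and the statistical framework are all left out. Function names, the scoring scheme and the example sequence are assumptions.

```python
from collections import defaultdict


def find_seeds(seq, k):
    """Return all pairs of start positions (i < j) sharing an exact k-mer."""
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    return [(i, j)
            for pos in positions.values()
            for a, i in enumerate(pos)
            for j in pos[a + 1:]]


def extend(seq, i, j, step, x_drop=2):
    """Ungapped extension of both copies from positions i and j, moving in
    direction `step` (+1 or -1). Match scores +1, mismatch -1. Extension
    stops when the score falls more than x_drop below its best value;
    the length reaching the best score is returned."""
    score = best = 0
    length = best_len = 0
    n = len(seq)
    while 0 <= i < n and 0 <= j < n:
        score += 1 if seq[i] == seq[j] else -1
        length += 1
        if score > best:
            best, best_len = score, length
        elif best - score > x_drop:
            break
        i += step
        j += step
    return best_len


def detect_repeats(seq, k, x_drop=2):
    """Seed detection followed by extension; returns sorted, de-duplicated
    (start_of_copy_1, start_of_copy_2, length) triples."""
    repeats = set()
    for i, j in find_seeds(seq, k):
        left = extend(seq, i - 1, j - 1, -1, x_drop)
        right = extend(seq, i + k, j + k, +1, x_drop)
        repeats.add((i - left, j - left, k + left + right))
    return sorted(repeats)


# Two 11-base copies differing by one substitution, separated by a spacer.
example = "CC" + "GATTACAGGCT" + "TCTCT" + "GATTACTGGCT" + "AA"
found = detect_repeats(example, k=6)
```

Here the single exact seed (`GATTAC`) is extended across the substitution, recovering both full-length copies as one approximate repeat.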
Genetics | 2008
Guillaume Achaz
Many data sets one could use for population genetics contain artifactual sites, i.e., sequencing errors. Here, we first explore the impact of such errors on several common summary statistics, assuming that sequencing errors are mostly singletons. We thus show that in the presence of those errors, estimators of θ can be strongly biased. We further show that even with a moderate number of sequencing errors, neutrality tests based on the frequency spectrum reject neutrality. This implies that analyses of data sets with such errors will systematically lead to wrong inferences of evolutionary scenarios. To avoid these errors, we propose two new estimators of θ that ignore singletons as well as two new tests Y and Y* that can be used to test neutrality despite sequencing errors. All in all, we show that even though singletons are ignored, these new tests show some power to detect deviations from a standard neutral model. We therefore advise the use of these new tests to strengthen conclusions in suspicious data sets.
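The effect of singleton errors on an estimator of θ, and the benefit of discarding singletons, can be shown with a short sketch. It assumes an unfolded site frequency spectrum (`sfs[i-1]` is the number of sites with `i` derived copies among `n` sequences) and errors that appear only as derived singletons. The singleton-free estimator shown is a Watterson-like analogue built for illustration and is not claimed to be the exact estimator of the paper. Function names are hypothetical.

```python
def harmonic(n):
    """a_n = sum_{i=1}^{n-1} 1/i, the usual Watterson normalizing constant."""
    return sum(1.0 / i for i in range(1, n))


def watterson_theta(sfs):
    """Classical Watterson estimator: segregating sites divided by a_n."""
    n = len(sfs) + 1
    return sum(sfs) / harmonic(n)


def watterson_theta_no_singletons(sfs):
    """Watterson-like estimator that drops the singleton class. Under
    neutrality E[xi_i] = theta / i, so the sites with i >= 2 sum to
    theta * (a_n - 1) in expectation. Requires n >= 3."""
    n = len(sfs) + 1
    return sum(sfs[1:]) / (harmonic(n) - 1.0)


# Expected neutral spectrum for theta = 5 and n = 10, then the same
# spectrum contaminated by 8 sequencing errors counted as singletons.
theta, n = 5.0, 10
clean = [theta / i for i in range(1, n)]
with_errors = [clean[0] + 8] + clean[1:]

biased = watterson_theta(with_errors)                 # inflated above 5
robust = watterson_theta_no_singletons(with_errors)   # unaffected by errors
```

The price of this robustness is that true singletons are discarded along with the errors, which is why the abstract reports only "some power" for the singleton-free tests.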
Proceedings of the National Academy of Sciences of the United States of America | 2015
Claire Régnier; Guillaume Achaz; Amaury Lambert; Robert H. Cowie; Philippe Bouchet; Benoît Fontaine
Significance: Since the 1980s, many biologists have concluded that the earth is in the midst of a massive biodiversity extinction crisis caused by human activities. Yet fewer than 1,000 of the planet’s 1.9 million known species are officially recorded as extinct. Skeptics have therefore asked “Is there really a crisis?” Mammals and birds provide the most robust data, because the status of almost all has been assessed. Invertebrates constitute over 99% of species diversity, but the status of only a tiny fraction has been assessed, thereby dramatically underestimating overall levels of extinction. Using data on terrestrial invertebrates, this study estimates that we may already have lost 7% of the species on Earth and that the biodiversity crisis is real.
Since the 1980s, many have suggested we are in the midst of a massive extinction crisis, yet only 799 (0.04%) of the 1.9 million known recent species are recorded as extinct, questioning the reality of the crisis. This low figure is due to the fact that the status of very few invertebrates, which represent the bulk of biodiversity, has been evaluated. Here we show, based on extrapolation from a random sample of land snail species via two independent approaches, that we may already have lost 7% (130,000 extinctions) of the species on Earth. However, this loss is masked by the emphasis on terrestrial vertebrates, the target of most conservation actions. Projections of species extinction rates are controversial because invertebrates are essentially excluded from these scenarios. Invertebrates can and must be assessed if we are to obtain a more realistic picture of the sixth extinction crisis.
The Journal of Infectious Diseases | 2008
Geetha Kutty; Frank Maldarelli; Guillaume Achaz; Joseph A. Kovacs
The genome of Pneumocystis, which causes life-threatening pneumonia in immunosuppressed patients, contains a multicopy gene family that encodes the major surface glycoprotein (Msg). Pneumocystis can vary the expressed Msg, presumably as a mechanism to avoid host immune responses. Analysis of 24 msg-gene sequences obtained from a single human isolate of Pneumocystis demonstrated that the sequences segregate into 2 branches. Results of a number of analyses suggest that recombination between msg genes is an important mechanism for generating msg diversity. Intrabranch recombination occurred more frequently than interbranch recombination. Restriction-fragment length polymorphism analysis of human isolates of Pneumocystis demonstrated substantial variation in the repertoire of the msg-gene family, variation that was not observed in laboratory isolates of Pneumocystis in rats or mice; this may be the result of examining outbred versus captive populations. Increased diversity in the Msg repertoire, generated in part by recombination, increases the potential for antigenic variation in this abundant surface protein.
Evolution | 2014
François Blanquart; Guillaume Achaz; Thomas Bataillon; Olivier Tenaillon
The fitness landscape, the mapping between genotypes and fitness, determines properties of the process of adaptation. Several small genotypic fitness landscapes have recently been built by selecting a handful of beneficial mutations and measuring fitness of all combinations of these mutations. Here, we generate several testable predictions for the properties of these small genotypic landscapes under Fisher's geometric model of adaptation. When the ancestral strain is far from the fitness optimum, we analytically compute the fitness effect of selected mutations and their epistatic interactions. Epistasis may be negative or positive on average depending on the distance of the ancestral genotype to the optimum and whether mutations were independently selected, or coselected in an adaptive walk. Simulations show that genotypic landscapes built from Fisher's model are very close to an additive landscape when the ancestral strain is far from the optimum. However, when it is close to the optimum, a large diversity of landscapes with substantial roughness and sign epistasis emerges. Strikingly, small genotypic landscapes built from several replicate adaptive walks on the same underlying landscape were highly variable, suggesting that several realizations of small genotypic landscapes are needed to gain information about the underlying architecture of the fitness landscape.
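A small genotypic landscape under Fisher's geometric model can be built explicitly. The sketch below assumes a Gaussian fitness function, log w(z) = -|z|^2 / 2, with the optimum at the origin. This is a common choice but an assumption here, since the abstract does not state the fitness function. Mutations are fixed phenotypic displacement vectors, and all names and numeric values are hypothetical.

```python
from itertools import product


def log_fitness(z):
    """Gaussian fitness around an optimum at the origin (assumed form)."""
    return -0.5 * sum(x * x for x in z)


def build_landscape(ancestor, mutations):
    """Map every genotype (tuple of 0/1 flags, one per mutation) to the log
    fitness of the phenotype reached by adding the mutations present."""
    landscape = {}
    for genotype in product((0, 1), repeat=len(mutations)):
        z = list(ancestor)
        for present, effect in zip(genotype, mutations):
            if present:
                z = [zi + ei for zi, ei in zip(z, effect)]
        landscape[genotype] = log_fitness(z)
    return landscape


def epistasis_two_loci(landscape):
    """Epistasis on the log scale for a two-mutation landscape:
    logw(11) - logw(10) - logw(01) + logw(00)."""
    return (landscape[(1, 1)] - landscape[(1, 0)]
            - landscape[(0, 1)] + landscape[(0, 0)])


# Ancestor two units from the optimum in a 2-dimensional phenotype space.
ancestor = [2.0, 0.0]
mutations = [[-0.5, 0.2], [-0.3, -0.4]]
landscape = build_landscape(ancestor, mutations)
eps = epistasis_two_loci(landscape)
```

With this particular fitness function, expanding the squared norms shows that pairwise epistasis on the log scale equals minus the dot product of the two mutation vectors, independently of the ancestor's position. Other fitness functions, or epistasis measured on the raw fitness scale, behave differently.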