Julien Y. Dutheil | Researchain

Publication

Featured research published by Julien Y. Dutheil.

Nature | 2012

Insights into hominid evolution from the gorilla genome sequence.

Aylwyn Scally; Julien Y. Dutheil; LaDeana W. Hillier; Gregory Jordan; Ian Goodhead; Javier Herrero; Asger Hobolth; Tuuli Lappalainen; Thomas Mailund; Tomas Marques-Bonet; Shane McCarthy; Stephen H. Montgomery; Petra C. Schwalie; Y. Amy Tang; Michelle C. Ward; Yali Xue; Bryndis Yngvadottir; Can Alkan; Lars Nørvang Andersen; Qasim Ayub; Edward V. Ball; Kathryn Beal; Brenda J. Bradley; Yuan Chen; Chris Clee; Stephen Fitzgerald; Tina Graves; Yong Gu; Paul Heath; Andreas Heger

Gorillas are humans’ closest living relatives after chimpanzees, and are of comparable importance for the study of human origins and evolution. Here we present the assembly and analysis of a genome sequence for the western lowland gorilla, and compare the whole genomes of all extant great ape genera. We propose a synthesis of genetic and fossil evidence consistent with placing the human–chimpanzee and human–chimpanzee–gorilla speciation events at approximately 6 and 10 million years ago. In 30% of the genome, gorilla is closer to human or chimpanzee than the latter are to each other; this is rarer around coding genes, indicating pervasive selection throughout great ape evolution, and has functional consequences in gene expression. A comparison of protein coding genes reveals approximately 500 genes showing accelerated evolution on each of the gorilla, human and chimpanzee lineages, and evidence for parallel acceleration, particularly of genes involved in hearing. We also compare the western and eastern gorilla species, estimating an average sequence divergence time 1.75 million years ago, but with evidence for more recent genetic exchange and a population bottleneck in the eastern species. The use of the genome sequence in these and future analyses will promote a deeper understanding of great ape biology and evolution.

Nature | 2011

Comparative and demographic analysis of orang-utan genomes

Devin P. Locke; LaDeana W. Hillier; Wesley C. Warren; Kim C. Worley; Lynne V. Nazareth; Donna M. Muzny; Shiaw-Pyng Yang; Zhengyuan Wang; Asif T. Chinwalla; Patrick Minx; Makedonka Mitreva; Lisa Cook; Kim D. Delehaunty; Catrina C. Fronick; Heather K. Schmidt; Lucinda A. Fulton; Robert S. Fulton; Joanne O. Nelson; Vincent Magrini; Craig S. Pohl; Tina Graves; Chris Markovic; Andy Cree; Huyen Dinh; Jennifer Hume; Christie Kovar; Gerald Fowler; Gerton Lunter; Stephen Meader; Andreas Heger

‘Orang-utan’ is derived from a Malay term meaning ‘man of the forest’ and aptly describes the southeast Asian great apes native to Sumatra and Borneo. The orang-utan species, Pongo abelii (Sumatran) and Pongo pygmaeus (Bornean), are the most phylogenetically distant great apes from humans, thereby providing an informative perspective on hominid evolution. Here we present a Sumatran orang-utan draft genome assembly and short read sequence data from five Sumatran and five Bornean orang-utan genomes. Our analyses reveal that, compared to other primates, the orang-utan genome has many unique features. Structural evolution of the orang-utan genome has proceeded much more slowly than other great apes, evidenced by fewer rearrangements, less segmental duplication, a lower rate of gene family turnover and surprisingly quiescent Alu repeats, which have played a major role in restructuring other primate genomes. We also describe a primate polymorphic neocentromere, found in both Pongo species, emphasizing the gradual evolution of orang-utan genome structure. Orang-utans have extremely low energy usage for a eutherian mammal, far lower than their hominid relatives. Adding their genome to the repertoire of sequenced primates illuminates new signals of positive selection in several pathways including glycolipid metabolism. From the population perspective, both Pongo species are deeply diverse; however, Sumatran individuals possess greater diversity than their Bornean counterparts, and more species-specific variation. Our estimate of Bornean/Sumatran speciation time, 400,000 years ago, is more recent than most previous studies and underscores the complexity of the orang-utan speciation process. Despite a smaller modern census population size, the Sumatran effective population size (Ne) expanded exponentially relative to the ancestral Ne after the split, while Bornean Ne declined over the same period. Overall, the resources and analyses presented here offer new opportunities in evolutionary genomics, insights into hominid biology, and an extensive database of variation for conservation efforts.

Nature | 2012

The bonobo genome compared with the chimpanzee and human genomes

Kay Prüfer; Kasper Munch; Ines Hellmann; Keiko Akagi; Jason R. Miller; Brian Walenz; Sergey Koren; Granger Sutton; Chinnappa D. Kodira; Roger Winer; James Knight; James C. Mullikin; Stephen Meader; Chris P. Ponting; Gerton Lunter; Saneyuki Higashino; Asger Hobolth; Julien Y. Dutheil; Emre Karakoc; Can Alkan; Saba Sajjadian; Claudia Rita Catacchio; Mario Ventura; Tomas Marques-Bonet; Evan E. Eichler; Claudine André; Rebeca Atencia; Lawrence Mugisha; Jörg Junhold; Nick Patterson

Two African apes are the closest living relatives of humans: the chimpanzee (Pan troglodytes) and the bonobo (Pan paniscus). Although they are similar in many respects, bonobos and chimpanzees differ strikingly in key social and sexual behaviours, and for some of these traits they show more similarity with humans than with each other. Here we report the sequencing and assembly of the bonobo genome to study its evolutionary relationship with the chimpanzee and human genomes. We find that more than three per cent of the human genome is more closely related to either the bonobo or the chimpanzee genome than these are to each other. These regions allow various aspects of the ancestry of the two ape species to be reconstructed. In addition, many of the regions that overlap genes may eventually help us understand the genetic basis of phenotypes that humans share with one of the two apes to the exclusion of the other.

PLOS Biology | 2005

Recombination Difference between Sexes: A Role for Haploid Selection

Thomas Lenormand; Julien Y. Dutheil

Why the autosomal recombination rate differs between female and male meiosis in most species has been a genetic enigma since the early study of meiosis. Some hypotheses have been put forward to explain this widespread phenomenon and, up to now, only one fact has emerged clearly: In species in which meiosis is achiasmate in one sex, it is the heterogametic one. This pattern, known as the Haldane-Huxley rule, is thought to be a side effect, on autosomes, of the suppression of recombination between the sex chromosomes. However, this rule does not hold for heterochiasmate species (i.e., species in which recombination is present in both sexes but varies quantitatively between sexes) and does not apply to species lacking sex chromosomes, such as hermaphroditic plants. In this paper, we show that in plants, heterochiasmy is due to a male-female difference in gametic selection and is not influenced by the presence of heteromorphic sex chromosomes. This finding provides strong empirical support in favour of a population genetic explanation for the evolution of heterochiasmy and, more broadly, for the evolution of sex and recombination.

Genome Research | 2011

Incomplete lineage sorting patterns among human, chimpanzee, and orangutan suggest recent orangutan speciation and widespread selection

Asger Hobolth; Julien Y. Dutheil; John Hawks; Mikkel H. Schierup; Thomas Mailund

We search the complete orangutan genome for regions where humans are more closely related to orangutans than to chimpanzees due to incomplete lineage sorting (ILS) in the ancestor of human and chimpanzees. The search uses our recently developed coalescent hidden Markov model (HMM) framework. We find ILS present in ∼1% of the genome, and that the ancestral species of human and chimpanzees never experienced a severe population bottleneck. The existence of ILS is validated with simulations, site pattern analysis, and analysis of rare genomic events. The existence of ILS allows us to disentangle the time of isolation of humans and orangutans (the speciation time) from the genetic divergence time, and we find speciation to be as recent as 9-13 million years ago (Mya; contingent on the calibration point). The analyses provide further support for a recent speciation of human and chimpanzee at ∼4 Mya and a diverse ancestor of human and chimpanzee with an effective population size of about 50,000 individuals. Posterior decoding infers ILS for each nucleotide in the genome, and we use this to deduce patterns of selection in the ancestral species. We demonstrate the effect of background selection in the common ancestor of humans and chimpanzees. In agreement with predictions from population genetics, ILS was found to be reduced in exons and gene-dense regions when we control for confounding factors such as GC content and recombination rate. Finally, we find the broad-scale recombination rate to be conserved through the complete ape phylogeny.

Genome Research | 2011

The making of a new pathogen: insights from comparative population genomics of the domesticated wheat pathogen Mycosphaerella graminicola and its wild sister species.

Eva H. Stukenbrock; Thomas Bataillon; Julien Y. Dutheil; Troels T. Hansen; Ruiqiang Li; Marcello Zala; Bruce A. McDonald; Jun Wang; Mikkel H. Schierup

The fungus Mycosphaerella graminicola emerged as a new pathogen of cultivated wheat during its domestication ~11,000 yr ago. We assembled 12 high-quality full genome sequences to investigate the genetic footprints of selection in this wheat pathogen and closely related sister species that infect wild grasses. We demonstrate a strong effect of natural selection in shaping the pathogen genomes with only ~3% of nonsynonymous mutations being effectively neutral. Forty percent of all fixed nonsynonymous substitutions, on the other hand, are driven by positive selection. Adaptive evolution has affected M. graminicola to the highest extent, consistent with recent host specialization. Positive selection has prominently altered genes encoding secreted proteins and putative pathogen effectors supporting the premise that molecular host-pathogen interaction is a strong driver of pathogen evolution. Recent divergence between pathogen sister species is attested by the high degree of incomplete lineage sorting (ILS) in their genomes. We exploit ILS to generate a genetic map of the species without any crossing data, document recent times of species divergence relative to genome divergence, and show that gene-rich regions or regions with low recombination experience stronger effects of natural selection on neutral diversity. Emergence of a new agricultural host selected a highly specialized and fast-evolving pathogen with unique evolutionary patterns compared with its wild relatives. The strong impact of natural selection, we document, is at odds with the small effective population sizes estimated and suggest that population sizes were historically large but likely unstable.

BMC Bioinformatics | 2006

Bio++: a set of C++ libraries for sequence analysis, phylogenetics, molecular evolution and population genetics.

Julien Y. Dutheil; Sylvain Gaillard; Eric Bazin; Sylvain Glémin; Vincent Ranwez; Nicolas Galtier; Khalid Belkhir

Background: A large number of bioinformatics applications in the fields of bio-sequence analysis, molecular evolution and population genetics typically share input/output methods, data storage requirements and data analysis algorithms. Such common features may be conveniently bundled into re-usable libraries, which enable the rapid development of new methods and robust applications. Results: We present Bio++, a set of Object Oriented libraries written in C++. Available components include classes for data storage and handling (nucleotide/amino-acid/codon sequences, trees, distance matrices, population genetics datasets), various input/output formats, basic sequence manipulation (concatenation, transcription, translation, etc.), phylogenetic analysis (maximum parsimony, Markov models, distance methods, likelihood computation and maximization), population genetics/genomics (diversity statistics, neutrality tests, various multi-locus analyses) and various algorithms for numerical calculus. Conclusion: Implementation of methods aims at being both efficient and user-friendly. A special concern was given to the library design to enable easy extension and new methods development. We defined a general hierarchy of classes that allow developers to implement their own algorithms while remaining compatible with the rest of the libraries. Bio++ source code is distributed free of charge under the CeCILL general public licence from its website http://kimura.univ-montp2.fr/BioPP.

BMC Evolutionary Biology | 2008

Non-homogeneous models of sequence evolution in the Bio++ suite of libraries and programs

Julien Y. Dutheil; Bastien Boussau

Background: Accurately modeling the sequence substitution process is required for the correct estimation of evolutionary parameters, be they phylogenetic relationships, substitution rates or ancestral states; it is also crucial to simulate realistic data sets. Such simulation procedures are needed to estimate the null-distribution of complex statistics, an approach referred to as parametric bootstrapping, and are also used to test the quality of phylogenetic reconstruction programs. It has often been observed that homologous sequences can vary widely in their nucleotide or amino-acid compositions, revealing that sequence evolution has changed importantly among lineages, and may therefore be most appropriately approached through non-homogeneous models. Several programs implementing such models have been developed, but they are limited in their possibilities: only a few particular models are available for likelihood optimization, and data sets cannot be easily generated using the resulting estimated parameters. Results: We hereby present a general implementation of non-homogeneous models of substitutions. It is available as dedicated classes in the Bio++ libraries and can hence be used in any C++ program. Two programs that use these classes are also presented. The first one, Bio++ Maximum Likelihood (BppML), estimates parameters of any non-homogeneous model and the second one, Bio++ Sequence Generator (BppSeqGen), simulates the evolution of sequences from these models. These programs allow the user to describe non-homogeneous models through a property file with a simple yet powerful syntax, without any programming required. Conclusion: We show that the general implementation introduced here can accommodate virtually any type of non-homogeneous models of sequence evolution, including heterotachous ones, while being computer efficient. We furthermore illustrate the use of such general models for parametric bootstrapping, using tests of non-homogeneity applied to an already published ribosomal RNA data set.

Proceedings of the National Academy of Sciences of the United States of America | 2012

Fusion of two divergent fungal individuals led to the recent emergence of a unique widespread pathogen species.

Eva H. Stukenbrock; Freddy Bugge Christiansen; Troels T. Hansen; Julien Y. Dutheil; Mikkel H. Schierup

In a genome alignment of five individuals of the ascomycete fungus Zymoseptoria pseudotritici, a close relative of the wheat pathogen Z. tritici (synonym Mycosphaerella graminicola), we observed peculiar diversity patterns. Long regions up to 100 kb without variation alternate with similarly long regions of high variability. The variable segments in the genome alignment are organized into two main haplotype groups that have diverged ∼3% from each other. The genome patterns in Z. pseudotritici are consistent with a hybrid speciation event resulting from a cross between two divergent haploid individuals. The resulting hybrids formed the new species without backcrossing to the parents. We observe no variation in 54% of the genome in the five individuals and estimate a complete loss of variation for at least 30% of the genome in the entire species. A strong population bottleneck following the hybridization event caused this loss of variation. Variable segments in the Z. pseudotritici genome exhibit the two haplotypes contributed by the parental individuals. From our previously estimated recombination map of Z. tritici and the size distribution of variable chromosome blocks untouched by recombination we estimate that the hybridization occurred ∼380 sexual generations ago. We show that the amount of lost variation is explained by genetic drift during the bottleneck and by natural selection, as evidenced by the correlation of presence/absence of variation with gene density and recombination rate. The successful spread of this unique reproductively isolated pathogen highlights the strong potential of hybridization in the emergence of pathogen species with sexual reproduction.

Genetics | 2009

Ancestral Population Genomics: The Coalescent Hidden Markov Model Approach

Julien Y. Dutheil; Ganeshkumar Ganapathy; Asger Hobolth; Thomas Mailund; Marcy K. Uyenoyama; Mikkel H. Schierup

With incomplete lineage sorting (ILS), the genealogy of closely related species differs along their genomes. The amount of ILS depends on population parameters such as the ancestral effective population sizes and the recombination rate, but also on the number of generations between speciation events. We use a hidden Markov model parameterized according to coalescent theory to infer the genealogy along a four-species genome alignment of closely related species and estimate population parameters. We analyze a basic, panmictic demographic model and study its properties using an extensive set of coalescent simulations. We assess the effect of the model assumptions and demonstrate that the Markov property provides a good approximation to the ancestral recombination graph. Using a too restricted set of possible genealogies, necessary to reduce the computational load, can bias parameter estimates. We propose a simple correction for this bias and suggest directions for future extensions of the model. We show that the patterns of ILS along a sequence alignment can be recovered efficiently together with the ancestral recombination rate. Finally, we introduce an extension of the basic model that allows for mutation rate heterogeneity and reanalyze human–chimpanzee–gorilla–orangutan alignments, using the new models. We expect that this framework will prove useful for population genomics and provide exciting insights into genome evolution.
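As a rough illustration of how the amount of ILS depends on the ancestral effective population size and the number of generations between speciation events, the sketch below uses the standard three-species coalescent result for the probability of a gene tree that disagrees with the species tree. It is not the hidden Markov model described in the abstract. The function name `ils_probability`, the diploid scaling of 2·Ne generations per coalescent unit, and the constant population size between the two splits are all assumptions made for this example.

```python
import math


def ils_probability(generations_between_splits, ancestral_ne):
    """Probability of a discordant gene tree for three species.

    Assumptions (illustrative only, not taken from the paper):
    a panmictic diploid ancestral population of constant size
    `ancestral_ne`, and no recombination within the locus.
    """
    if generations_between_splits < 0 or ancestral_ne <= 0:
        raise ValueError("need generations >= 0 and Ne > 0")
    # Time between the two speciation events in coalescent units
    # (one unit = 2 * Ne generations for diploids).
    t = generations_between_splits / (2.0 * ancestral_ne)
    # With probability exp(-t) the two sister lineages fail to coalesce
    # in their ancestral population; the three lineages then coalesce at
    # random further back, and 2 of the 3 possible topologies are discordant.
    return (2.0 / 3.0) * math.exp(-t)


if __name__ == "__main__":
    # Hypothetical values chosen only to show the trend.
    for gens in (0, 50_000, 100_000, 200_000):
        print(gens, round(ils_probability(gens, 50_000), 4))
```

Under these assumptions the probability falls off exponentially as the interval between speciation events grows, and rises with a larger ancestral population, which is the qualitative dependence the abstract describes.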
