Publication


Featured research published by Vincent Daubin.


Nature | 2006

Global trends of whole-genome duplications revealed by the ciliate Paramecium tetraurelia

Jean-Marc Aury; Olivier Jaillon; Laurent Duret; Benjamin Noel; Claire Jubin; Betina M. Porcel; Béatrice Segurens; Vincent Daubin; Véronique Anthouard; Nathalie Aiach; Olivier Arnaiz; Alain Billaut; Janine Beisson; Isabelle Blanc; Khaled Bouhouche; Francisco Câmara; Sandra Duharcourt; Roderic Guigó; Delphine Gogendeau; Michael Katinka; Anne-Marie Keller; Roland Kissmehl; Catherine Klotz; Anne Le Mouël; Gersende Lepère; Sophie Malinsky; Mariusz Nowacki; Jacek K. Nowak; Helmut Plattner; Julie Poulain

The duplication of entire genomes has long been recognized as having great potential for evolutionary novelties, but the mechanisms underlying their resolution through gene loss are poorly understood. Here we show that in the unicellular eukaryote Paramecium tetraurelia, a ciliate, most of the nearly 40,000 genes arose through at least three successive whole-genome duplications. Phylogenetic analysis indicates that the most recent duplication coincides with an explosion of speciation events that gave rise to the P. aurelia complex of 15 sibling species. We observed that gene loss occurs over a long timescale, not as an initial massive event. Genes from the same metabolic pathway or protein complex have common patterns of gene loss, and highly expressed genes are over-retained after all duplications. The conclusion of this analysis is that many genes are maintained after whole-genome duplication not because of functional innovation but because of gene dosage constraints.


PLOS Biology | 2003

From Gene Trees to Organismal Phylogeny in Prokaryotes: The Case of the γ-Proteobacteria

Emmanuelle Lerat; Vincent Daubin; Nancy A. Moran

The rapid increase in published genomic sequences for bacteria presents the first opportunity to reconstruct evolutionary events on the scale of entire genomes. However, extensive lateral gene transfer (LGT) may thwart this goal by preventing the establishment of organismal relationships based on individual gene phylogenies. The group for which cases of LGT are most frequently documented and for which the greatest density of complete genome sequences is available is the γ-Proteobacteria, an ecologically diverse and ancient group including free-living species as well as pathogens and intracellular symbionts of plants and animals. We propose an approach to multigene phylogeny using complete genomes and apply it to the case of the γ-Proteobacteria. We first applied stringent criteria to identify a set of likely gene orthologs and then tested the compatibilities of the resulting protein alignments with several phylogenetic hypotheses. Our results demonstrate phylogenetic concordance among virtually all (203 of 205) of the selected gene families, with each of the exceptions consistent with a single LGT event. The concatenated sequences of the concordant families yield a fully resolved phylogeny. This topology also received strong support in analyses aimed at excluding effects of heterogeneity in nucleotide base composition across lineages. Our analysis indicates that single-copy orthologous genes are resistant to horizontal transfer, even in ancient bacterial groups subject to high rates of LGT. This gene set can be identified and used to yield robust hypotheses for organismal phylogenies, thus establishing a foundation for reconstructing the evolutionary transitions, such as gene transfer, that underlie diversity in genome content and organization.


PLOS Biology | 2005

Evolutionary Origins of Genomic Repertoires in Bacteria

Emmanuelle Lerat; Vincent Daubin; Howard Ochman; Nancy A. Moran

Explaining the diversity of gene repertoires has been a major problem in modern evolutionary biology. In eukaryotes, this diversity is believed to result mainly from gene duplication and loss, but in prokaryotes, lateral gene transfer (LGT) can also contribute substantially to genome contents. To determine the histories of gene inventories, we conducted an exhaustive analysis of gene phylogenies for all gene families in a widely sampled group, the γ-Proteobacteria. We show that, although these bacterial genomes display striking differences in gene repertoires, most gene families having representatives in several species have congruent histories. Other than the few vast multigene families, gene duplication has contributed relatively little to the contents of these genomes; instead, LGT, over time, provides most of the diversity in genomic repertoires. Most such acquired genes are lost, but the majority of those that persist in genomes are transmitted strictly vertically. Although our analyses are limited to the γ-Proteobacteria, these results resolve a long-standing paradox—i.e., the ability to make robust phylogenetic inferences in light of substantial LGT.


Genome Research | 2013

Genome-scale coestimation of species and gene trees

Bastien Boussau; Gergely J. Szöllősi; Laurent Duret; Manolo Gouy; Eric Tannier; Vincent Daubin

Comparisons of gene trees and species trees are key to understanding major processes of genome evolution such as gene duplication and loss. Because current methods to reconstruct phylogenies fail to model the two-way dependency between gene trees and the species tree, they often misrepresent gene and species histories. We present a new probabilistic model to jointly infer rooted species and gene trees for dozens of genomes and thousands of gene families. We use simulations to show that this method accurately infers the species tree and gene trees, is robust to misspecification of the models of sequence and gene family evolution, and provides a precise historic record of gene duplications and losses throughout genome evolution. We simultaneously reconstruct the history of mammalian species and their genes based on 36 completely sequenced genomes, and use the reconstructed gene trees to infer the gene content and organization of ancestral mammalian genomes. We show that our method yields a more accurate picture of ancestral genomes than the trees available in the authoritative database Ensembl.


Philosophical Transactions of the Royal Society B | 2008

Dealing with incongruence in phylogenomic analyses

Nicolas Galtier; Vincent Daubin

Incongruence between gene trees is the main challenge faced by phylogeneticists in the genomic era. Incongruence can occur for artefactual reasons, when we fail to recover the correct gene trees, or for biological reasons, when true gene trees are actually distinct from each other, and from the species tree. Horizontal gene transfers (HGTs) between genomes are an important process of bacterial evolution resulting in a substantial amount of phylogenetic conflicts between gene trees. We argue that the (bacterial) species tree is still a meaningful scientific concept even in the case of HGTs, and that reconstructing it is still a valid goal. We tentatively assess the amount of phylogenetic incongruence caused by HGTs in bacteria by comparing bacterial datasets to a metazoan dataset in which transfers are presumably very scarce or absent. We review existing phylogenomic methods and their ability to return to the user both the vertical (speciation/extinction history) and horizontal (gene transfers) phylogenetic signals.
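One common way to put a number on incongruence between two trees is to count the clades that appear in one tree but not the other (a rooted variant of the Robinson-Foulds distance). The abstract above does not say which measure the authors use, so the following is only an illustrative sketch. The nested-tuple tree encoding and the function names `collect_clades` and `rooted_rf_distance` are assumptions made for this example.

```python
# Illustrative sketch only: trees are nested tuples whose leaves are taxon
# names (strings). This encoding and these function names are hypothetical.

def collect_clades(tree):
    """Return (leaf_set, clades) where clades holds one frozenset per internal node."""
    if isinstance(tree, str):
        return frozenset([tree]), set()
    leaves = frozenset()
    found = set()
    for child in tree:
        child_leaves, child_clades = collect_clades(child)
        leaves |= child_leaves
        found |= child_clades
    found.add(leaves)  # the clade rooted at this internal node
    return leaves, found


def rooted_rf_distance(tree_a, tree_b):
    """Count clades present in exactly one of the two rooted trees."""
    leaves_a, clades_a = collect_clades(tree_a)
    leaves_b, clades_b = collect_clades(tree_b)
    if leaves_a != leaves_b:
        raise ValueError("trees must be built on the same set of taxa")
    return len(clades_a ^ clades_b)


congruent = rooted_rf_distance((("A", "B"), ("C", "D")), (("B", "A"), ("D", "C")))
conflicting = rooted_rf_distance((("A", "B"), ("C", "D")), (("A", "C"), ("B", "D")))
print(congruent, conflicting)
```

A distance of zero means the two trees imply the same groupings. A nonzero value alone does not say whether the conflict is artefactual (reconstruction error) or biological (for example, a transfer), which is the distinction the abstract draws.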


Nature Reviews Microbiology | 2010

The origin of eukaryotes and their relationship with the Archaea: are we at a phylogenomic impasse?

Simonetta Gribaldo; Anthony M. Poole; Vincent Daubin; Patrick Forterre; Céline Brochier-Armanet

The origin of eukaryotes and their evolutionary relationship with the Archaea is a major biological question and the subject of intense debate. In the context of the classical view of the universal tree of life, the Archaea and the Eukarya have a common ancestor, the nature of which remains undetermined. Alternative views propose instead that the Eukarya evolved directly from a bona fide archaeal lineage. Several recent large-scale phylogenomic studies using an array of approaches are divided in supporting either one or the other scenario, despite analysing largely overlapping data sets of universal genes. We examine the reasons for such a lack of consensus and consider how alternative approaches may enable progress in answering this fascinating and as-yet-unresolved question.


BMC Bioinformatics | 2009

Databases of homologous gene families for comparative genomics

Simon Penel; Anne-Muriel Arigon; Jean-François Dufayard; Anne-Sophie Sertier; Vincent Daubin; Laurent Duret; Manolo Gouy; Guy Perrière

Background: Comparative genomics is a central step in many sequence analysis studies, from gene annotation and the identification of new functional regions in genomes, to the study of evolutionary processes at the molecular level (speciation, single gene or whole genome duplications, etc.) and phylogenetics. In that context, databases providing users with high-quality homologous families and sequence alignments as well as phylogenetic trees based on state-of-the-art algorithms are becoming indispensable. Methods: We developed an automated procedure allowing massive all-against-all similarity searches, gene clustering, multiple alignments computation, and phylogenetic trees construction and reconciliation. The application of this procedure to a very large set of sequences is possible through parallel computing on a large computer cluster. Results: Three databases were developed using this procedure: HOVERGEN, HOGENOM and HOMOLENS. These databases share the same architecture but differ in their content. HOVERGEN contains sequences from vertebrates, HOGENOM is mainly devoted to completely sequenced microbial organisms, and HOMOLENS is devoted to metazoan genomes from Ensembl. Access to the databases is provided through Web query forms, a general retrieval system and a client-server graphical interface. The latter can be used to perform tree-pattern based searches allowing, among other uses, the retrieval of sets of orthologous genes. The three databases, as well as the software required to build and query them, can be used or downloaded from the PBIL (Pôle Bioinformatique Lyonnais) site at http://pbil.univ-lyon1.fr/.


Briefings in Bioinformatics | 2011

Models, algorithms and programs for phylogeny reconciliation

Jean-Philippe Doyon; Vincent Ranwez; Vincent Daubin; Vincent Berry

Gene sequences contain a gold mine of phylogenetic information. But unfortunately for taxonomists this information does not only tell the story of the species from which it was collected. Genes have their own complex histories which record speciation events, of course, but also many other events. Among them, gene duplications, transfers and losses are especially important to identify. These events are crucial to account for when reconstructing the history of species, and they play a fundamental role in the evolution of genomes, the diversification of organisms and the emergence of new cellular functions. We review reconciliations between gene and species trees, which are rigorous approaches for identifying duplications, transfers and losses that mark the evolution of a gene family. Existing reconciliation models and algorithms are reviewed and difficulties in modeling gene transfers are discussed. We also compare different reconciliation programs along with their advantages and disadvantages.
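To make the idea of reconciliation concrete, here is a minimal sketch of the classic last-common-ancestor (LCA) mapping, one of the simplest reconciliation approaches. It only infers duplications, ignores transfers and losses, and is not claimed to be the method of any specific program covered by the review. The nested-tuple tree encoding, the convention that each gene-tree leaf is labelled with its species name, and the function names are all assumptions made for illustration.

```python
# Illustrative sketch only: trees are nested tuples with string leaves.
# Gene-tree leaves are labelled directly with species names (an assumption).

def species_clades(tree):
    """List one frozenset of leaf species per node; a node's own clade comes last."""
    if isinstance(tree, str):
        return [frozenset([tree])]
    result = []
    leaves = frozenset()
    for child in tree:
        sub = species_clades(child)
        result.extend(sub)
        leaves |= sub[-1]  # last entry is the child's own clade
    result.append(leaves)
    return result


def lca_clade(clades, species_set):
    """Smallest species-tree clade containing every species in species_set."""
    return min((c for c in clades if species_set <= c), key=len)


def count_duplications(gene_tree, species_tree):
    """Count gene-tree nodes that map to the same species node as one of their children."""
    clades = species_clades(species_tree)
    duplications = 0

    def visit(node):
        nonlocal duplications
        if isinstance(node, str):
            leaf = frozenset([node])
            return leaf, lca_clade(clades, leaf)
        results = [visit(child) for child in node]
        species = frozenset().union(*(s for s, _ in results))
        mapped = lca_clade(clades, species)
        if any(mapped == child_mapped for _, child_mapped in results):
            duplications += 1  # parent and child share a species node
        return species, mapped

    visit(gene_tree)
    return duplications


species_tree = (("A", "B"), "C")
print(count_duplications((("A", "B"), "C"), species_tree))                 # congruent gene tree
print(count_duplications(((("A", "B"), ("A", "B")), "C"), species_tree))  # two copies in A and B
```

Because this sketch knows only about duplication, it explains any disagreement between the gene tree and the species tree as a duplication. For instance, the gene tree `(("A", "C"), "B")` is scored as one duplication under the species tree above, even though a transfer could be a better explanation, which illustrates why modeling transfers is harder.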


Proceedings of the National Academy of Sciences of the United States of America | 2012

Phylogenetic modeling of lateral gene transfer reconstructs the pattern and relative timing of speciations

Gergely J. Szöllősi; Bastien Boussau; Sophie S. Abby; Eric Tannier; Vincent Daubin

The timing of the evolution of microbial life has largely remained elusive due to the scarcity of prokaryotic fossil record and the confounding effects of the exchange of genes among possibly distant species. The history of gene transfer events, however, is not a series of individual oddities; it records which lineages were concurrent and thus provides information on the timing of species diversification. Here, we use a probabilistic model of genome evolution that accounts for differences between gene phylogenies and the species tree as series of duplication, transfer, and loss events to reconstruct chronologically ordered species phylogenies. Using simulations we show that we can robustly recover accurate chronologically ordered species phylogenies in the presence of gene tree reconstruction errors and realistic rates of duplication, transfer, and loss. Using genomic data we demonstrate that we can infer rooted species phylogenies using homologous gene families from complete genomes of 10 bacterial and archaeal groups. Focusing on cyanobacteria, distinguished among prokaryotes by a relative abundance of fossils, we infer the maximum likelihood chronologically ordered species phylogeny based on 36 genomes with 8,332 homologous gene families. We find the order of speciation events to be in full agreement with the fossil record and the inferred phylogeny of cyanobacteria to be consistent with the phylogeny recovered from established phylogenomics methods. Our results demonstrate that lateral gene transfers, detected by probabilistic models of genome evolution, can be used as a source of information on the timing of evolution, providing a valuable complement to the limited prokaryotic fossil record.


Systematic Biology | 2015

The Inference of Gene Trees with Species Trees

Gergely J. Szöllősi; Eric Tannier; Vincent Daubin; Bastien Boussau

This article reviews the various models that have been used to describe the relationships between gene trees and species trees. Molecular phylogeny has focused mainly on improving models for the reconstruction of gene trees based on sequence alignments. Yet, most phylogeneticists seek to reveal the history of species. Although the histories of genes and species are tightly linked, they are seldom identical, because genes duplicate, are lost or horizontally transferred, and because alleles can coexist in populations for periods that may span several speciation events. Building models describing the relationship between gene and species trees can thus improve the reconstruction of gene trees when a species tree is known, and vice versa. Several approaches have been proposed to solve the problem in one direction or the other, but in general neither gene trees nor species trees are known. Only a few studies have attempted to jointly infer gene trees and species trees. These models account for gene duplication and loss, transfer or incomplete lineage sorting. Some of them consider several types of events together, but none exists currently that considers the full repertoire of processes that generate gene trees along the species tree. Simulations as well as empirical studies on genomic data show that combining gene tree–species tree models with models of sequence evolution improves gene tree reconstruction. In turn, these better gene trees provide a more reliable basis for studying genome evolution or reconstructing ancestral chromosomes and ancestral gene sequences. We predict that gene tree–species tree methods that can deal with genomic data sets will be instrumental to advancing our understanding of genomic evolution.

Collaboration


Top Co-Authors

Howard Ochman

University of Texas at Austin
