Vincent Berry | Researchain

Publication

Featured researches published by Vincent Berry.

Briefings in Bioinformatics | 2011

Models, algorithms and programs for phylogeny reconciliation

Jean-Philippe Doyon; Vincent Ranwez; Vincent Daubin; Vincent Berry

Gene sequences contain a gold mine of phylogenetic information. But unfortunately for taxonomists this information does not only tell the story of the species from which it was collected. Genes have their own complex histories which record speciation events, of course, but also many other events. Among them, gene duplications, transfers and losses are especially important to identify. These events are crucial to account for when reconstructing the history of species, and they play a fundamental role in the evolution of genomes, the diversification of organisms and the emergence of new cellular functions. We review reconciliations between gene and species trees, which are rigorous approaches for identifying duplications, transfers and losses that mark the evolution of a gene family. Existing reconciliation models and algorithms are reviewed and difficulties in modeling gene transfers are discussed. We also compare different reconciliation programs along with their advantages and disadvantages.

Theoretical Computer Science | 2000

Inferring evolutionary trees with strong combinatorial evidence

Vincent Berry

We consider the problem of inferring the evolutionary tree of a set of n species. We propose a quartet reconstruction method which specifically produces trees whose edges have strong combinatorial evidence. Given a set Q of resolved quartets defined on the studied species, the method computes the unique maximum subset Q∗ of Q which is equivalent to a tree and outputs the corresponding tree as an estimate of the species’ phylogeny. We use a characterization of the subset Q∗ due to Bandelt and Dress (Adv. Appl. Math. 7 (1986) 309–343) to provide an O(n^4) incremental algorithm for this variant of the NP-hard quartet consistency problem. Moreover, when choosing the resolution of the quartets by the four-point method (FPM) and considering the Cavender–Farris model of evolution, we show that the convergence rate of the Q∗ method is at worst polynomial when the maximum evolutionary distance between two species is bounded. We complete these theoretical results with an experimental study on real and simulated data sets. The results show that (i) as expected, the strong combinatorial constraints it imposes on each edge lead the Q∗ method to propose very few incorrect edges; (ii) more surprisingly, the method infers trees with a relatively high degree of resolution.
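The four-point method mentioned in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `resolve_quartet`, the dict-of-dicts distance format, and the choice to return `None` on ties are all assumptions made here for illustration. For four species, the pairing whose two within-pair distances sum to the smallest value is taken as the quartet's resolution.

```python
def resolve_quartet(dist, a, b, c, d):
    """Resolve the quartet {a, b, c, d} with the four-point condition.

    `dist` is a symmetric dict-of-dicts of pairwise distances (an assumed
    input format). Returns the pairing with the smallest sum of within-pair
    distances, or None when the two smallest sums tie (unresolved quartet).
    """
    sums = {
        ((a, b), (c, d)): dist[a][b] + dist[c][d],
        ((a, c), (b, d)): dist[a][c] + dist[b][d],
        ((a, d), (b, c)): dist[a][d] + dist[b][c],
    }
    ranked = sorted(sums.items(), key=lambda item: item[1])
    if ranked[0][1] == ranked[1][1]:
        return None  # no pairing is strictly best
    return ranked[0][0]


# Distances from a toy tree with cherries (A, B) and (C, D):
# every pendant edge and the internal edge have length 1.
dist = {
    "A": {"A": 0, "B": 2, "C": 3, "D": 3},
    "B": {"A": 2, "B": 0, "C": 3, "D": 3},
    "C": {"A": 3, "B": 3, "C": 0, "D": 2},
    "D": {"A": 3, "B": 3, "C": 2, "D": 0},
}
split = resolve_quartet(dist, "A", "B", "C", "D")
```

On these toy distances the pairing AB|CD has sum 4 while the other two pairings have sum 6, so AB|CD is selected. Estimated distances from real data are noisy, so a practical version would probably compare sums with a tolerance rather than exact equality.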

Nucleic Acids Research | 2012

Sequencing of the smallest Apicomplexan genome from the human pathogen Babesia microti

Emmanuel Cornillot; Kamel Hadj-Kaddour; Amina Dassouli; Benjamin Noel; Vincent Ranwez; Benoit Vacherie; Yoann Augagneur; Virginie Bres; Aurelie Duclos; Sylvie Randazzo; B. Carcy; Françoise Debierre-Grockiego; Stephane Delbecq; Karina Moubri-Ménage; Hosam Shams-Eldin; Sahar Usmani-Brown; Frédéric Bringaud; Patrick Wincker; Christian P. Vivarès; Ralph T. Schwarz; Theo Schetters; Peter J. Krause; A. Gorenflot; Vincent Berry; Valérie Barbe; Choukri Ben Mamoun

We have sequenced the genome of the emerging human pathogen Babesia microti and compared it with that of other protozoa. B. microti has the smallest nuclear genome among all Apicomplexan parasites sequenced to date with three chromosomes encoding ∼3500 polypeptides, several of which are species specific. Genome-wide phylogenetic analyses indicate that B. microti is significantly distant from all species of Babesidae and Theileridae and defines a new clade in the phylum Apicomplexa. Furthermore, unlike all other Apicomplexa, its mitochondrial genome is circular. Genome-scale reconstruction of functional networks revealed that B. microti has the minimal metabolic requirement for intraerythrocytic protozoan parasitism. B. microti multigene families differ from those of other protozoa in both the copy number and organization. Two lateral transfer events with significant metabolic implications occurred during the evolution of this parasite. The genomic sequencing of B. microti identified several targets suitable for the development of diagnostic assays and novel therapies for human babesiosis.

Bioinformatics | 2009

Computing galled networks from real data

Daniel H. Huson; Regula Rupp; Vincent Berry; Philippe Gambette; Christophe Paul

Motivation: Developing methods for computing phylogenetic networks from biological data is an important problem posed by molecular evolution and much work is currently being undertaken in this area. Although promising approaches exist, there are no tools available that biologists could easily and routinely use to compute rooted phylogenetic networks on real datasets containing tens or hundreds of taxa. Biologists are interested in clades, i.e. groups of monophyletic taxa, and these are usually represented by clusters in a rooted phylogenetic tree. The problem of computing an optimal rooted phylogenetic network from a set of clusters is hard in general. Indeed, even the problem of just determining whether a given network contains a given cluster is hard. Hence, some researchers have focused on topologically restricted classes of networks, such as galled trees and level-k networks, that are more tractable, but have the practical drawback that a given set of clusters will usually not possess such a representation. Results: In this article, we argue that galled networks (a generalization of galled trees) provide a good trade-off between level of generality and tractability. Any set of clusters can be represented by some galled network and the question whether a cluster is contained in such a network is easy to solve. Although the computation of an optimal galled network involves successively solving instances of two different NP-complete problems, in practice our algorithm solves this problem exactly on large datasets containing hundreds of taxa and many reticulations in seconds, as illustrated by a dataset containing 279 prokaryotes. Availability: We provide a fast, robust and easy-to-use implementation of this work in version 2.0 of our tree-handling software Dendroscope, freely available from http://www.dendroscope.org.

Systematics and Biodiversity | 2014

Multiple nuclear genes stabilize the phylogenetic backbone of the genus Quercus

François Hubert; Guido W. Grimm; Emmanuelle Jousselin; Vincent Berry; Alain Franc; Antoine Kremer

Phylogenetic relationships among 108 oak species (genus Quercus L.) were inferred using DNA sequences of six nuclear genes selected from the existing genomic resources of the genus. Previous phylogenetic reconstructions based on traditional molecular markers are inconclusive at the deeper nodes. Overall, weak phylogenetic signals were obtained for each individual gene analysis, but stronger signals were obtained when gene sequences were concatenated. Our data support the recognition of six major intrageneric groups Cyclobalanopsis, Cerris, Ilex, Quercus, Lobatae and Protobalanus. Our analyses provide resolution at deeper nodes but with moderate support and a more robust infrageneric classification within the two major clades, the ‘Old World Oaks’ (Cyclobalanopsis, Cerris, Ilex) and ‘New World Oaks’ (Quercus, Lobatae, Protobalanus). However, depending on outgroup choice, our analysis yielded two alternative placements of the Cyclobalanopsis clade within the genus Quercus. When Castanea Mill. was chosen as outgroup, our data suggested that the genus Quercus comprised two clades corresponding to two subgenera as traditionally recognized by Camus: subgenus Euquercus Hickel and Camus and subgenus Cyclobalanopsis Øersted (Schneider). However, when Notholithocarpus Manos, Cannon and S. Oh was chosen as an outgroup, subgenus Cyclobalanopsis clustered with Cerris and Ilex groups to form the Old World clade. To assess the placement of the root, we complemented our dataset with published data of ITS and CRC sequences. Based on the concatenated eight gene sequences, the most likely root position is at the split between the ‘Old World Oaks’ and the ‘New World Oaks’, which is one of the alternative positions suggested by our six gene analysis. Using a dating approach, we inferred an Eocene age for the primary divergences in Quercus and a root age of about 50–55 Ma, which agrees with palaeobotanical evidence. Finally, irrespective of the outgroup choice, our data boost the topology within the New World clade, where (Protobalanus + Quercus) is a sister clade of Lobatae. Inferred divergence ages within this clade and the Cerris–Ilex clade are generally younger than could be expected from the fossil record, indicating that morphological differentiation pre-dates genetic isolation in this clade.

Systematic Biology | 2007

PhySIC: A Veto Supertree Method with Desirable Properties

Vincent Ranwez; Vincent Berry; Alexis Criscuolo; Pierre-Henri Fabre; Sylvain Guillemot; Celine Scornavacca; Emmanuel J. P. Douzery

This paper focuses on veto supertree methods; i.e., methods that aim at producing a conservative synthesis of the relationships agreed upon by all source trees. We propose desirable properties that a supertree should satisfy in this framework, namely the non-contradiction property (PC) and the induction property (PI). The former requires that the supertree does not contain relationships that contradict one or a combination of the source topologies, whereas the latter requires that all topological information contained in the supertree is present in a source tree or collectively induced by several source trees. We provide simple examples to illustrate their relevance and that allow a comparison with previously advocated properties. We show that these properties can be checked in polynomial time for any given rooted supertree. Moreover, we introduce the PhySIC method (PHYlogenetic Signal with Induction and non-Contradiction). For k input trees spanning a set of n taxa, this method produces a supertree that satisfies the above-mentioned properties in O(kn^3 + n^4) computing time. The polytomies of the produced supertree are also tagged by labels indicating areas of conflict as well as those with insufficient overlap. As a whole, PhySIC enables the user to quickly summarize consensual information of a set of trees and localize groups of taxa for which the data require consolidation. Lastly, we illustrate the behaviour of PhySIC on primate data sets of various sizes, and propose a supertree covering 95% of all primate extant genera. The PhySIC algorithm is available at http://atgc.lirmm.fr/cgi-bin/PhySIC.

BMC Bioinformatics | 2008

PhySIC_IST: cleaning source trees to infer more informative supertrees

Celine Scornavacca; Vincent Berry; Vincent Lefort; Emmanuel J. P. Douzery; Vincent Ranwez

Background: Supertree methods combine phylogenies with overlapping sets of taxa into a larger one. Topological conflicts frequently arise among source trees for methodological or biological reasons, such as long branch attraction, lateral gene transfers, gene duplication/loss or deep gene coalescence. When topological conflicts occur among source trees, liberal methods infer supertrees containing the most frequent alternative, while veto methods infer supertrees not contradicting any source tree, i.e. discard all conflicting resolutions. When the source trees host a significant number of topological conflicts or have a small taxon overlap, supertree methods of both kinds can propose poorly resolved, hence uninformative, supertrees. Results: To overcome this problem, we propose to infer non-plenary supertrees, i.e. supertrees that do not necessarily contain all the taxa present in the source trees, discarding those whose position greatly differs among source trees or for which insufficient information is provided. We detail a variant of the PhySIC veto method called PhySIC_IST that can infer non-plenary supertrees. PhySIC_IST aims at inferring supertrees that satisfy the same appealing theoretical properties as with PhySIC, while being as informative as possible under this constraint. The informativeness of a supertree is estimated using a variation of the CIC (Cladistic Information Content) criterion that takes into account both the presence of multifurcations and the absence of some taxa. Additionally, we propose a statistical preprocessing step called STC (Source Trees Correction) to correct the source trees prior to the supertree inference. STC is a liberal step that removes the parts of each source tree that significantly conflict with other source trees. Combining STC with a veto method allows an explicit trade-off between veto and liberal approaches, tuned by a single parameter. Performing large-scale simulations, we observe that STC+PhySIC_IST infers much more informative supertrees than PhySIC, while preserving low type I error compared to the well-known MRP method. Two biological case studies on animals confirm that the STC preprocess successfully detects anomalies in the source trees while STC+PhySIC_IST provides well-resolved supertrees agreeing with current knowledge in systematics. Conclusion: The paper introduces and tests two new methodologies, PhySIC_IST and STC, that demonstrate the interest in inferring non-plenary supertrees as well as preprocessing the source trees. An implementation of the methods is available at: http://www.atgc-montpellier.fr/physic_ist/.
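To give a feel for how multifurcations lower informativeness, here is a rough sketch of the standard Cladistic Information Content for a rooted tree containing all taxa, computed as minus the base-2 logarithm of the fraction of rooted binary trees that resolve the given tree. It is not the variation used by PhySIC_IST, which also accounts for missing taxa. The nested-tuple tree encoding and the function names are assumptions made here for illustration.

```python
from math import log2


def rooted_binary_tree_count(k):
    """Number of rooted binary trees on k labelled leaves: (2k - 3)!!."""
    count = 1
    for odd in range(3, 2 * k - 2, 2):
        count *= odd
    return count


def leaves_and_resolutions(tree):
    """Return (leaf count, number of binary resolutions) of a nested-tuple tree."""
    if not isinstance(tree, tuple):
        return 1, 1  # a single leaf
    # A node with k children can be resolved in (2k - 3)!! ways.
    n_leaves, n_resolutions = 0, rooted_binary_tree_count(len(tree))
    for child in tree:
        child_leaves, child_resolutions = leaves_and_resolutions(child)
        n_leaves += child_leaves
        n_resolutions *= child_resolutions
    return n_leaves, n_resolutions


def cic(tree):
    """Standard CIC in bits: -log2(resolving binary trees / all binary trees)."""
    n_leaves, n_resolutions = leaves_and_resolutions(tree)
    return -log2(n_resolutions / rooted_binary_tree_count(n_leaves))


star = ("a", "b", "c", "d")           # fully unresolved
partial = (("a", "b"), "c", "d")      # one clade, root still a trifurcation
binary = ((("a", "b"), "c"), "d")     # fully resolved
```

Under this definition a star tree carries zero bits, and each additional resolved clade shrinks the set of compatible binary trees and so raises the score.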

PLOS Neglected Tropical Diseases | 2013

Genetic Structure and Evolution of the Leishmania Genus in Africa and Eurasia: What Does MLSA Tell Us

Fouad El Baidouri; Laure Diancourt; Vincent Berry; François Chevenet; Francine Pratlong; P. Marty; Christophe Ravel

Leishmaniasis is a complex parasitic disease from a taxonomic, clinical and epidemiological point of view. The role of genetic exchanges has been questioned for over twenty years and their recent experimental demonstration along with the identification of interspecific hybrids in natura has revived this debate. After arguing that genetic exchanges were exceptional and did not contribute to Leishmania evolution, it is currently proposed that interspecific exchanges could be a major driving force for rapid adaptation to new reservoirs and vectors, expansion into new parasitic cycles and adaptation to new life conditions. To assess the existence of gene flows between species during evolution, we used an MLSA-based (MultiLocus Sequence Analysis) approach to analyze 222 Leishmania strains from Africa and Eurasia to accurately represent the genetic diversity of this genus. We observed a remarkable congruence of the phylogenetic signal and identified seven genetic clusters that include mainly independent lineages which are accumulating divergences without any sign of recent interspecific recombination. From a taxonomic point of view, the strong genetic structuration of the different species does not question the current classification, except for species that cause visceral forms of leishmaniasis (L. donovani, L. infantum and L. archibaldi). Although these taxa cause specific clinical forms of the disease and are maintained through different parasitic cycles, they are not clearly distinct and form a continuum, in line with the concept of species complex already suggested for this group thirty years ago. These results should have practical consequences concerning the molecular identification of parasites and the subsequent therapeutic management of the disease.

Journal of Discrete Algorithms | 2007

Maximum agreement and compatible supertrees

Vincent Berry; François Nicolas

Given a set of leaf-labelled trees with identical leaf sets, the MAST problem, respectively MCT problem, consists of finding a largest subset of leaves such that all input trees restricted to these leaves are isomorphic, respectively compatible. In this paper, we propose extensions of these problems to the context of supertree inference, where input trees have non-identical leaf sets. This situation is of particular interest in phylogenetics. The resulting problems are called SMAST and SMCT. A sufficient condition is given that identifies cases where these problems can be solved by resorting to MAST and MCT as subproblems. This condition is met, for instance, when only two input trees are considered. Then we give algorithms for SMAST and SMCT that benefit from the link with the subtree problems. These algorithms run in time linear to the time needed to solve MAST, respectively MCT, on an instance of the same or smaller size. It is shown that arbitrary instances of SMAST and SMCT can be turned in polynomial time into instances composed of trees with a bounded number of leaves. SMAST is shown to be W[2]-hard when the considered parameter is the number of input leaves that have to be removed to obtain the agreement of the input trees. A similar result holds for SMCT. Moreover, the corresponding optimization problems, that is the complements of SMAST and SMCT, cannot be approximated in polynomial time within any constant factor, unless P=NP. These results also hold when the input trees have a bounded number of leaves. The presented results apply to both collections of rooted and unrooted trees.
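The MAST definition at the start of this abstract can be made concrete with a brute-force sketch for rooted trees on identical leaf sets. This is an exponential-time illustration only, not one of the paper's algorithms. It relies on the fact that two rooted leaf-labelled trees are isomorphic exactly when they have the same clusters; the nested-tuple encoding and all function names are assumptions made here.

```python
from itertools import combinations


def clusters(tree):
    """Return the set of clusters (leaf sets below each node) of a nested-tuple tree."""
    found = set()

    def visit(node):
        if isinstance(node, tuple):
            leaves = frozenset().union(*(visit(child) for child in node))
        else:
            leaves = frozenset([node])
        found.add(leaves)
        return leaves

    visit(tree)
    return found


def restrict(cluster_set, kept):
    """Clusters of the tree restricted to the leaves in `kept`."""
    return {cluster & kept for cluster in cluster_set if cluster & kept}


def brute_force_mast(trees):
    """Return a largest leaf subset on which all rooted input trees agree."""
    cluster_sets = [clusters(tree) for tree in trees]
    all_leaves = max(cluster_sets[0], key=len)  # the root cluster
    for size in range(len(all_leaves), 0, -1):
        for subset in combinations(sorted(all_leaves), size):
            kept = frozenset(subset)
            restricted = [restrict(cs, kept) for cs in cluster_sets]
            if all(r == restricted[0] for r in restricted):
                return kept
    return frozenset()


t1 = ((("a", "b"), "c"), "d")
t2 = ((("a", "c"), "b"), "d")
agreement = brute_force_mast([t1, t2])
```

The two example trees disagree on whether a groups with b or with c, so no agreement exists on all four leaves, but removing any one of a, b or c yields isomorphic restrictions. The search tries subsets in sorted order and returns the first of these size-three solutions it finds.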

Research in Computational Molecular Biology | 1999

Faster reliable phylogenetic analysis

Vincent Berry; David Bryant

We present fast new algorithms for phylogenetic reconstruction from distance data or weighted quartets. The methods are conservative: they will only return edges that are well supported by the input data. This approach is not only philosophically attractive; the conservative tree estimate can be used as a basis for further tree refinement or divide and conquer algorithms. The capability to process quartet data allows these algorithms to be used in tandem with ordinal or qualitative phylogenetic analysis methods. We provide algorithms for three standard conservative phylogenetic constructions: the Buneman tree, the Refined Buneman tree, and split decomposition. We introduce and exploit combinatorial formalisms involving trees, quartets, and splits, and make particular use of an attractive duality between unrooted trees, splits, and dissimilarities on one hand, and rooted trees, clusters, and similarity measures on the other. Using these techniques, we achieve O(n) improvements in the time complexity of the best previously published algorithms (where n is the number of studied species). Our algorithms will be included in the next edition of the popular SplitsTree software package.
