Publication


Featured research published by Vincent Ranwez.


PLOS ONE | 2011

MACSE: Multiple Alignment of Coding SEquences Accounting for Frameshifts and Stop Codons

Vincent Ranwez; Sébastien Harispe; Frédéric Delsuc; Emmanuel J. P. Douzery

Until now the most efficient solution to align nucleotide sequences containing open reading frames was to use indirect procedures that align amino acid translations before reporting the inferred gap positions at the codon level. There are two important pitfalls with this approach. Firstly, any premature stop codon impedes using such a strategy. Secondly, each sequence is translated with the same reading frame from beginning to end, so that the presence of a single additional nucleotide leads to both aberrant translation and alignment. We present an algorithm that has the same space and time complexity as the classical Needleman-Wunsch algorithm while accommodating sequencing errors and other biological deviations from the coding frame. The resulting pairwise coding sequence alignment method was extended to a multiple sequence alignment (MSA) algorithm implemented in a program called MACSE (Multiple Alignment of Coding SEquences accounting for frameshifts and stop codons). MACSE is the first automatic solution to align protein-coding gene datasets containing non-functional sequences (pseudogenes) without disrupting the underlying codon structure. It has also proved useful in detecting undocumented frameshifts in public database sequences and in aligning next-generation sequencing reads/contigs against a reference coding sequence. MACSE is distributed as an open-source Java executable file with freely available source code and can be used via a web interface at: http://mbb.univ-montp2.fr/macse.
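The second pitfall described above can be illustrated with a minimal Python sketch. It is not MACSE's algorithm; it only shows how a single extra nucleotide garbles every codon translated after it when one reading frame is used throughout. The sequences and the `translate` helper are made up for illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq, frame=0):
    """Translate complete codons of seq, starting at offset frame."""
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)


# Hypothetical coding sequence and a copy with one inserted nucleotide.
reference = "ATGGCTGAAACCTGGAAA"
shifted = reference[:6] + "C" + reference[6:]

print(translate(reference))    # translation in the correct frame
print(translate(shifted))      # same frame: codons after the insertion differ
print(translate(shifted[7:]))  # resuming after the inserted base restores the frame
```

Only the first two amino acids agree between the two full translations, whereas restarting the translation just after the inserted base recovers the original downstream residues. This is the kind of signal a frameshift-aware aligner can exploit.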


Trends in Genetics | 2009

GC-biased gene conversion promotes the fixation of deleterious amino acid changes in primates

Nicolas Galtier; Laurent Duret; Sylvain Glémin; Vincent Ranwez

GC-biased gene conversion (gBGC) is a recently discovered, recombination-associated segregation distortion, which influences GC-content dynamics in the mammalian genome. We scanned the primate proteome for examples of exon-specific, lineage-specific accelerated amino acid evolution. Here, we show that such episodes are frequently accompanied by an increase in GC-content, which extends to synonymous and intronic positions. This demonstrates that gBGC has substantially (negatively) impacted the evolutionary trajectory of human proteins by promoting the fixation of deleterious AT→GC mutations.


Genome Research | 2010

Contrasting GC-content dynamics across 33 mammalian genomes: Relationship with life-history traits and chromosome sizes

Jonathan Romiguier; Vincent Ranwez; Emmanuel J. P. Douzery; Nicolas Galtier

The origin, evolution, and functional relevance of genomic variations in GC content are a long-debated topic, especially in mammals. Most of the existing literature, however, has focused on a small number of model species and/or limited sequence data sets. We analyzed more than 1000 orthologous genes in 33 fully sequenced mammalian genomes, reconstructed their ancestral isochore organization in the maximum likelihood framework, and explored the evolution of third-codon position GC content in representatives of 16 orders and 27 families. We showed that the previously reported erosion of GC-rich isochores is not a general trend. Several species (e.g., shrew, microbat, tenrec, rabbit) have independently undergone a marked increase in GC content, with a widening gap between the GC-poorest and GC-richest classes of genes. The intensively studied apes and (especially) murids do not reflect the general placental pattern. We correlated GC-content evolution with species life-history traits and cytology. Significant effects of body mass and genome size were detected, with each being consistent with the GC-biased gene conversion model.
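The quantity tracked in this study, GC content at third codon positions, is simple to compute for a single coding sequence. The sketch below is an illustration only, not the authors' pipeline: the `gc3` function name and the example sequence are assumptions, and any incomplete trailing codon is ignored.

```python
def gc3(cds):
    """Return the fraction of G or C among third codon positions of cds."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # drop an incomplete trailing codon
    third_positions = cds[2:usable:3]
    if not third_positions:
        raise ValueError("sequence contains no complete codon")
    gc_count = sum(base in "GC" for base in third_positions)
    return gc_count / len(third_positions)


# Hypothetical sequence: codons ATG, GCC, AAA have third bases G, C, A.
print(gc3("ATGGCCAAA"))
```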


Molecular Ecology Resources | 2012

Reference-free transcriptome assembly in non-model animals from next-generation sequencing data.

Vincent Cahais; Philippe Gayral; Georgia Tsagkogeorga; José Melo-Ferreira; Marion Ballenghien; Lucy A. Weinert; Ylenia Chiari; Khalid Belkhir; Vincent Ranwez; Nicolas Galtier

Next‐generation sequencing (NGS) technologies offer the opportunity for population genomic study of non‐model organisms sampled in the wild. The transcriptome is a convenient and popular target for such purposes. However, designing genetic markers from NGS transcriptome data requires assembling gene‐coding sequences out of short reads. This is a complex task owing to gene duplications, genetic polymorphism, alternative splicing and transcription noise. Typical assembling programmes return thousands of predicted contigs, whose connection to the species' true gene content is unclear, and from which SNP definition is difficult. Here, the transcriptomes of five diverse non‐model animal species (hare, turtle, ant, oyster and tunicate) were assembled from newly generated 454 and Illumina sequence reads. In two species for which a reference genome is available, a new procedure was introduced to annotate each predicted contig as either a full‐length cDNA, fragment, chimera, allele, paralogue, genomic sequence or other, based on the number of, and overlap between, blast hits to the appropriate reference. Analyses showed that (i) the highest quality assemblies are obtained when 454 and Illumina data are combined, (ii) typical de novo assemblies include a majority of irrelevant cDNA predictions and (iii) assemblies can be appropriately cleaned by filtering contigs based on length and coverage. We conclude that robust, reference‐free assembly of thousands of genes from transcriptomic NGS data is possible, opening promising perspectives for transcriptome‐based population genomics in animals. A Galaxy pipeline implementing our best‐performing assembling strategy is provided.
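Point (iii) above, cleaning an assembly by filtering contigs on length and coverage, can be sketched in a few lines of Python. The `Contig` record, the `filter_contigs` function and both default thresholds are hypothetical placeholders; the abstract does not state the cut-offs used in the study.

```python
from dataclasses import dataclass


@dataclass
class Contig:
    name: str
    length: int      # contig length in base pairs
    coverage: float  # mean read depth over the contig


def filter_contigs(contigs, min_length=200, min_coverage=2.5):
    """Keep contigs that meet both thresholds (assumed values, not from the paper)."""
    return [
        c for c in contigs
        if c.length >= min_length and c.coverage >= min_coverage
    ]


# Made-up contigs: one passes, one is too short, one is too thinly covered.
contigs = [
    Contig("c1", 1500, 12.0),
    Contig("c2", 150, 30.0),
    Contig("c3", 900, 1.0),
]
kept = filter_contigs(contigs)
print([c.name for c in kept])
```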


Research in Computational Molecular Biology | 2010

An efficient algorithm for gene/species trees parsimonious reconciliation with losses, duplications and transfers

Jean-Philippe Doyon; Celine Scornavacca; K. Yu. Gorbunov; Gergely J. Szollosi; Vincent Ranwez; Vincent Berry

Tree reconciliation methods aim at estimating the evolutionary events that cause discrepancy between gene trees and species trees. We provide a discrete computational model that considers duplications, transfers and losses of genes. The model yields a fast and exact algorithm to infer time consistent and most parsimonious reconciliations. Then we study the conditions under which parsimony is able to accurately infer such events. Overall, it performs well even under realistic rates, transfers being in general less accurately recovered than duplications. An implementation is freely available at http://www.atgc-montpellier.fr/MPR.


Molecular Biology and Evolution | 2013

Less Is More in Mammalian Phylogenomics: AT-Rich Genes Minimize Tree Conflicts and Unravel the Root of Placental Mammals

Jonathan Romiguier; Vincent Ranwez; Frédéric Delsuc; Nicolas Galtier; Emmanuel J. P. Douzery

Despite the rapid increase of size in phylogenomic data sets, a number of important nodes on animal phylogeny are still unresolved. Among these, the rooting of the placental mammal tree is still a controversial issue. One difficulty lies in the pervasive phylogenetic conflicts among genes, with each one telling its own story, which may be reliable or not. Here, we identified a simple criterion, that is, the GC content, which substantially helps in determining which gene trees best reflect the species tree. We assessed the ability of 13,111 coding sequence alignments to correctly reconstruct the placental phylogeny. We found that GC-rich genes induced a higher amount of conflict among gene trees and performed worse than AT-rich genes in retrieving well-supported, consensual nodes on the placental tree. We interpret this GC effect mainly as a consequence of genome-wide variations in recombination rate. Indeed, recombination is known to drive GC-content evolution through GC-biased gene conversion and might be problematic for phylogenetic reconstruction, for instance, in an incomplete lineage sorting context. When we focused on the AT-richest fraction of the data set, the resolution level of the placental phylogeny was greatly increased, and a strong support was obtained in favor of an Afrotheria rooting, that is, Afrotheria as the sister group of all other placentals. We show that in mammals most conflicts among gene trees, which have so far hampered the resolution of the placental tree, are concentrated in the GC-rich regions of the genome. We argue that the GC content, because it is a reliable indicator of the long-term recombination rate, is an informative criterion that could help in identifying the most reliable molecular markers for species tree inference.


BMC Evolutionary Biology | 2007

OrthoMaM: a database of orthologous genomic markers for placental mammal phylogenetics.

Vincent Ranwez; Frédéric Delsuc; Sylvie Ranwez; Khalid Belkhir; Marie-Ka Tilak; Emmanuel J. P. Douzery

Background: Molecular sequence data have become the standard in modern day phylogenetics. In particular, several long-standing questions of mammalian evolutionary history have been recently resolved thanks to the use of molecular characters. Yet, most studies have focused on only a handful of standard markers. The availability of an ever increasing number of whole genome sequences is a gold mine for modern systematics. Genomic data now provide the opportunity to select new markers that are potentially relevant for further resolving branches of the mammalian phylogenetic tree at various taxonomic levels. Description: The EnsEMBL database was used to determine a set of orthologous genes from 12 available complete mammalian genomes. As targets for possible amplification and sequencing in additional taxa, more than 3,000 exons of length > 400 bp have been selected, among which 118, 368, 608, and 674 are respectively retrieved for 12, 11, 10, and 9 species. A bioinformatic pipeline has been developed to provide evolutionary descriptors for these candidate markers in order to assess their potential phylogenetic utility. The resulting OrthoMaM (Orthologous Mammalian Markers) database can be queried and alignments can be downloaded through a dedicated web interface http://kimura.univ-montp2.fr/orthomam. Conclusion: The importance of marker choice in phylogenetic studies has long been stressed. Our database centered on complete genome information now makes it possible to select promising markers for a given phylogenetic question or a systematic framework by querying a number of evolutionary descriptors. The usefulness of the database is illustrated with two biological examples. First, two potentially useful markers were identified for rodent systematics based on relevant evolutionary parameters and sequenced in additional species. Second, a complete, gapless 94 kb supermatrix of 118 orthologous exons was assembled for 12 mammals. Phylogenetic analyses using probabilistic methods unambiguously supported the new placental phylogeny by retrieving the monophyly of Glires, Euarchontoglires, Laurasiatheria, and Boreoeutheria. Muroid rodents thus do not represent a basal placental lineage as it was mistakenly reasserted in some recent phylogenomic analyses based on fewer taxa. We expect the OrthoMaM database to be useful for further resolving the phylogenetic tree of placental mammals and for better understanding the evolutionary dynamics of their genomes, i.e., the forces that shaped coding sequences in terms of selective constraints.


Briefings in Bioinformatics | 2011

Models, algorithms and programs for phylogeny reconciliation

Jean-Philippe Doyon; Vincent Ranwez; Vincent Daubin; Vincent Berry

Gene sequences contain a gold mine of phylogenetic information. But unfortunately for taxonomists this information does not only tell the story of the species from which it was collected. Genes have their own complex histories which record speciation events, of course, but also many other events. Among them, gene duplications, transfers and losses are especially important to identify. These events are crucial to account for when reconstructing the history of species, and they play a fundamental role in the evolution of genomes, the diversification of organisms and the emergence of new cellular functions. We review reconciliations between gene and species trees, which are rigorous approaches for identifying duplications, transfers and losses that mark the evolution of a gene family. Existing reconciliation models and algorithms are reviewed and difficulties in modeling gene transfers are discussed. We also compare different reconciliation programs along with their advantages and disadvantages.


BMC Bioinformatics | 2006

Bio++: a set of C++ libraries for sequence analysis, phylogenetics, molecular evolution and population genetics.

Julien Y. Dutheil; Sylvain Gaillard; Eric Bazin; Sylvain Glémin; Vincent Ranwez; Nicolas Galtier; Khalid Belkhir

Background: A large number of bioinformatics applications in the fields of bio-sequence analysis, molecular evolution and population genetics typically share input/output methods, data storage requirements and data analysis algorithms. Such common features may be conveniently bundled into re-usable libraries, which enable the rapid development of new methods and robust applications. Results: We present Bio++, a set of Object Oriented libraries written in C++. Available components include classes for data storage and handling (nucleotide/amino-acid/codon sequences, trees, distance matrices, population genetics datasets), various input/output formats, basic sequence manipulation (concatenation, transcription, translation, etc.), phylogenetic analysis (maximum parsimony, Markov models, distance methods, likelihood computation and maximization), population genetics/genomics (diversity statistics, neutrality tests, various multi-locus analyses) and various algorithms for numerical calculus. Conclusion: Implementation of methods aims at being both efficient and user-friendly. A special concern was given to the library design to enable easy extension and new methods development. We defined a general hierarchy of classes that allow developers to implement their own algorithms while remaining compatible with the rest of the libraries. Bio++ source code is distributed free of charge under the CeCILL general public licence from its website http://kimura.univ-montp2.fr/BioPP.

Collaboration


Dive into Vincent Ranwez's collaborations.

Top Co-Authors

Nicolas Galtier

University of Montpellier

Vincent Berry

University of Montpellier

Sylvain Glémin

University of Montpellier

Sylvain Santoni

Institut national de la recherche agronomique

Nicolas Fiorini

National Institutes of Health
