Christian Baudet | Researchain

Publication

Featured research published by Christian Baudet.

Bioinformatics | 2010

Cassis: detection of genomic rearrangement breakpoints

Christian Baudet; Claire Lemaitre; Zanoni Dias; Christian Gautier; Eric Tannier; Marie-France Sagot

Summary: Genomes undergo large structural changes that alter their organization. The chromosomal regions affected by these rearrangements are called breakpoints, while those which have not been rearranged are called synteny blocks. Lemaitre et al. presented a new method to precisely delimit rearrangement breakpoints in a genome by comparison with the genome of a related species. Receiving as input a list of one-to-one orthologous genes found in the genomes of two species, the method builds a set of reliable and non-overlapping synteny blocks and refines the regions that are not contained in them. Through the alignment of each breakpoint sequence against its specific orthologous sequences in the other species, we can look for weak similarities inside the breakpoint, thus extending the synteny blocks and narrowing the breakpoints. The identification of the narrowed breakpoints relies on a segmentation algorithm and is statistically assessed. Here, we present the package Cassis that implements this method of precise detection of genomic rearrangement breakpoints. Availability: Perl and R scripts are freely available for download at http://pbil.univ-lyon1.fr/software/Cassis/. Documentation with methodological background, technical aspects, download and setup instructions, as well as examples of applications are available together with the package. The package was tested on Linux and Mac OS environments and is distributed under the GNU GPL License. Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.

Algorithms for Molecular Biology | 2015

EUCALYPT: efficient tree reconciliation enumerator

Beatrice Donati; Christian Baudet; Blerina Sinaimeri; Pierluigi Crescenzi; Marie-France Sagot

Background: Phylogenetic tree reconciliation is the approach of choice for investigating the coevolution of sets of organisms such as hosts and parasites. It consists in a mapping between the parasite tree and the host tree using event-based maximum parsimony. Given a cost model for the events, many optimal reconciliations are however possible. Any further biological interpretation of them must therefore take this into account, making the capacity to enumerate all optimal solutions a crucial point. Only two algorithms currently exist that attempt such enumeration; in one case not all possible solutions are produced while in the other not all cost vectors are currently handled. The objective of this paper is two-fold. The first is to fill this gap, and the second is to test whether the number of solutions generally observed can be an issue in terms of interpretation. Results: We present a polynomial-delay algorithm for enumerating all optimal reconciliations. We show that in general many solutions exist. We give an example where, for two pairs of host-parasite trees each having fewer than 41 leaves, the number of solutions is 5120, even when only time-feasible ones are kept. To facilitate their interpretation, those solutions are also classified in terms of how many of each event they contain. The number of different classes of solutions may thus be notably smaller than the number of solutions, yet it may remain high, in particular for the cases where losses have cost 0. In fact, depending on the cost vector, both the number of solutions and the number of classes may increase considerably. To further deal with this problem, we introduce and analyse a restricted version where host switches are allowed to happen only between species that are within some fixed distance along the host tree. This restriction allows us to reduce the number of time-feasible solutions while preserving the same optimal cost, as well as to find time-feasible solutions with a cost close to the optimal in the cases where no time-feasible solution is found. Conclusions: We present Eucalypt, a polynomial-delay algorithm for enumerating all optimal reconciliations which is freely available at http://eucalypt.gforge.inria.fr/.
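The classification of solutions by event counts can be illustrated with a minimal Python sketch. It is not Eucalypt's code: the event names (cospeciation, duplication, host switch, loss), the cost vector, and the three toy solutions are all assumptions made up for illustration. Each solution is reduced to its vector of event counts, its cost is the dot product with the cost vector, and solutions with identical counts fall into the same class.

```python
from collections import Counter
from typing import NamedTuple


class EventCounts(NamedTuple):
    """Number of events of each type in one reconciliation (assumed event set)."""
    cospeciation: int
    duplication: int
    host_switch: int
    loss: int


def reconciliation_cost(events, cost_vector):
    """Total cost: sum over event types of (count * cost per event)."""
    return sum(n * c for n, c in zip(events, cost_vector))


def classify(solutions):
    """Group solutions by their event-count vector; returns class -> size."""
    return Counter(solutions)


# Hypothetical cost vector: (cospeciation, duplication, host switch, loss).
COSTS = (0, 1, 2, 1)

# Three made-up solutions; the first and third share the same event counts.
solutions = [
    EventCounts(5, 1, 1, 2),
    EventCounts(4, 2, 1, 1),
    EventCounts(5, 1, 1, 2),
]

classes = classify(solutions)
for events, size in classes.items():
    print(events, "cost:", reconciliation_cost(events, COSTS), "solutions:", size)
```

In this toy case three solutions of equal cost collapse into two classes, which is the kind of reduction the abstract describes.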

Systematic Biology | 2015

Cophylogeny Reconstruction via an Approximate Bayesian Computation

Christian Baudet; Beatrice Donati; Blerina Sinaimeri; Pierluigi Crescenzi; Christian Gautier; Catherine Matias; Marie-France Sagot

Despite an increasingly vast literature on cophylogenetic reconstructions for studying host–parasite associations, understanding the common evolutionary history of such systems remains a problem that is far from being solved. Most algorithms for host–parasite reconciliation use an event-based model, where the events include in general (a subset of) cospeciation, duplication, loss, and host switch. All known parsimonious event-based methods then assign a cost to each type of event in order to find a reconstruction of minimum cost. The main problem with this approach is that the cost of the events strongly influences the reconciliation obtained. Some earlier approaches attempt to avoid this problem by finding a Pareto set of solutions and hence by considering event costs under some minimization constraints. To deal with this problem, we developed an algorithm, called Coala, for estimating the frequency of the events based on an approximate Bayesian computation approach. The benefits of this method are 2-fold: (i) it provides more confidence in the set of costs to be used in a reconciliation, and (ii) it allows estimation of the frequency of the events in cases where the data set consists of trees with a large number of taxa. We evaluate our method on simulated and on biological data sets. We show that in both cases, for the same pair of host and parasite trees, different sets of frequencies for the events lead to equally probable solutions. Moreover, often these solutions differ greatly in terms of the number of inferred events. It appears crucial to take this into account before attempting any further biological interpretation of such reconciliations. More generally, we also show that the set of frequencies can vary widely depending on the input host and parasite trees. Indiscriminately applying a standard vector of costs may thus not be a good strategy.
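The general idea of approximate Bayesian computation can be sketched in Python with plain rejection sampling. This is not the Coala algorithm: the simulator below is a toy stand-in that draws independent events, and the observed counts, the tolerance, and the distance function are assumptions chosen only to show the accept/reject loop.

```python
import random

EVENTS = ("cospeciation", "duplication", "host_switch", "loss")


def sample_prior(rng):
    """Draw an event-probability vector uniformly from the simplex."""
    draws = [rng.expovariate(1.0) for _ in EVENTS]
    total = sum(draws)
    return [d / total for d in draws]


def simulate_counts(probs, n_events, rng):
    """Toy simulator (assumption): n_events independent draws of event types."""
    counts = [0] * len(EVENTS)
    for idx in rng.choices(range(len(EVENTS)), weights=probs, k=n_events):
        counts[idx] += 1
    return counts


def abc_rejection(observed, n_draws=2000, tolerance=0.4, seed=0):
    """Keep the prior draws whose simulated counts land close to the observed ones."""
    rng = random.Random(seed)
    n_events = sum(observed)
    accepted = []
    for _ in range(n_draws):
        probs = sample_prior(rng)
        simulated = simulate_counts(probs, n_events, rng)
        # Normalised L1 distance between simulated and observed event counts.
        distance = sum(abs(s - o) for s, o in zip(simulated, observed)) / n_events
        if distance <= tolerance:
            accepted.append(probs)
    return accepted


# Hypothetical observed event counts, in the order given by EVENTS.
accepted = abc_rejection([20, 8, 8, 4])
if accepted:
    means = [sum(p[i] for p in accepted) / len(accepted) for i in range(len(EVENTS))]
    print(dict(zip(EVENTS, (round(m, 2) for m in means))))
```

The accepted vectors approximate a posterior over event frequencies; a spread-out set of accepted vectors reflects the abstract's point that different frequency sets can explain the same data about equally well.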

Brazilian Symposium on Bioinformatics | 2014

Length and Symmetry on the Sorting by Weighted Inversions Problem

Christian Baudet; Ulisses Dias; Zanoni Dias

Large-scale mutational events that occur when stretches of DNA sequence move throughout genomes are called genome rearrangement events. In bacteria, inversions are one of the most frequently observed rearrangements. In some bacterial families, inversions are biased in favor of symmetry as shown by recent research [6, 8, 10]. In addition, several results suggest that short segment inversions are more frequent in the evolution of microbial genomes [4, 6, 15]. Although the symmetry and length of the reversed segments seem very important, they have not been considered together in any problem in the genome rearrangement field. Here, we define the problem of sorting genomes (or permutations) using inversions whose costs are assigned based on their lengths and asymmetries. We present five procedures and assess their performance on small permutations. The ideas presented in this paper provide insights to solve the problem and set the stage for a proper theoretical analysis.

IEEE/ACM Transactions on Computational Biology and Bioinformatics | 2017

Sorting Circular Permutations by Super Short Reversals

Gustavo Rodrigues Galvão; Christian Baudet; Zanoni Dias

We consider the problem of sorting a circular permutation by super short reversals (i.e., reversals of length at most 2), a problem that finds application in comparative genomics. Polynomial-time solutions to the unsigned version of this problem are known, but the signed version remained open. In this paper, we present the first polynomial-time solution to the signed version of this problem. Moreover, we perform experiments for inferring phylogenies of two different groups of bacterial species and compare our results with the phylogenies presented in previous works. Finally, to facilitate phylogenetic studies based on the methods studied in this paper, we present a web tool for rearrangement-based phylogenetic inference using short operations, such as super short reversals.
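The operation itself can be sketched in Python. This only illustrates what a super short reversal does to a signed circular permutation under the usual convention (a reversal negates the elements it touches and reverses their order); it is not the sorting algorithm from the paper, and the function name and list representation are assumptions.

```python
def super_short_reversal(perm, start, length):
    """Apply a signed reversal of length 1 or 2 to a circular permutation.

    perm   -- list of nonzero signed integers, read circularly
    start  -- index of the first affected position
    length -- 1 (flip one sign) or 2 (swap two neighbours and flip both signs)
    """
    if length not in (1, 2):
        raise ValueError("a super short reversal has length 1 or 2")
    n = len(perm)
    if n < length:
        raise ValueError("permutation is shorter than the reversal")
    result = list(perm)
    i = start % n
    if length == 1:
        result[i] = -perm[i]
    else:
        j = (start + 1) % n  # wraps around: the last and first elements are adjacent
        result[i], result[j] = -perm[j], -perm[i]
    return result


step1 = super_short_reversal([1, -3, 2], 1, 2)
print(step1)                               # [1, -2, 3]
print(super_short_reversal(step1, 1, 1))   # [1, 2, 3]
print(super_short_reversal([3, 2, 1], 2, 2))  # wrap-around case: [-1, 2, -3]
```

Because the permutation is circular, a length-2 reversal starting at the last index acts on the last and first elements, as in the third call.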

International Symposium on Bioinformatics Research and Applications | 2015

Sorting Signed Circular Permutations by Super Short Reversals

Gustavo Rodrigues Galvão; Christian Baudet; Zanoni Dias

We consider the problem of sorting a circular permutation by reversals of length at most 2, a problem that finds application in comparative genomics. Polynomial-time solutions for the unsigned version of this problem are known, but the signed version remained open. In this paper, we present the first polynomial-time solution for the signed version of this problem. Moreover, we perform an experiment for inferring distances and phylogenies for published Yersinia genomes and compare the results with the phylogenies presented in previous works.

International Conference on Bioinformatics | 2013

Greedy Randomized Search Procedure to Sort Genomes using Symmetric, Almost-Symmetric and Unitary Inversions

Ulisses Dias; Christian Baudet; Zanoni Dias

Genome Rearrangement is a field that addresses the problem of finding the minimum number of global operations that transform one given genome into another. In this work we develop an algorithm for three constrained versions of the event called inversion, which occurs when a chromosome breaks at two locations called breakpoints and the DNA between the breakpoints is reversed. The constrained versions are called symmetric, almost-symmetric and unitary inversions. In this paper, we present a greedy randomized search procedure to find the minimum number of such operations between two genomes. Our approach is, to our knowledge, the first genome rearrangement problem modeled using this metaheuristic. Our model is an iterative process in which each iteration receives a feasible solution whose neighborhood is investigated for a better solution. This search uses greediness to shape the candidate list and randomness to select elements from the list. A previous greedy heuristic was used as an initial solution. In almost every case, we were able to improve that initial solution by providing a new sequence of inversions that uses fewer operations. For permutations of size 10, our solutions were, on average, 5 inversions shorter than the initial solution. For permutations of size 15 and 20, our solutions were, on average, 10 and 16 inversions shorter than the initial solution, respectively. For longer permutations ranging from 25 to 50 elements, we generated solutions that were, on average, 20 to 22 inversions shorter than the initial solution. We believe that the method proposed in this work can be adjusted to other genome rearrangement problems.

Algorithms for Molecular Biology | 2012

Sampling solution traces for the problem of sorting permutations by signed reversals

Christian Baudet; Zanoni Dias; Marie-France Sagot

Background: Traditional algorithms to solve the problem of sorting by signed reversals output just one optimal solution while the space of all optimal solutions can be huge. A so-called trace represents a group of solutions which share the same set of reversals that must be applied to sort the original permutation following a partial ordering. By using traces, we therefore can represent the set of optimal solutions in a more compact way. Algorithms for enumerating the complete set of traces of solutions were developed. However, due to their exponential complexity, their practical use is limited to small permutations. A partial enumeration of traces is a sampling of the complete set of traces and can be an alternative for the study of distinct evolutionary scenarios of large permutations. Ideally, the sampling should be done uniformly from the space of all optimal solutions. This is however conjectured to be #P-complete. Results: We propose and evaluate three algorithms for producing a sampling of the complete set of traces that instead can be shown in practice to preserve some of the characteristics of the space of all solutions. The first algorithm (RA) performs the construction of traces through a random selection of reversals on the list of optimal 1-sequences. The second algorithm (DFALT) consists in a slight modification of an algorithm that performs the complete enumeration of traces. Finally, the third algorithm (SWA) is based on a sliding window strategy to improve the enumeration of traces. All proposed algorithms were able to enumerate traces for permutations with up to 200 elements. Conclusions: We analysed the distribution of the enumerated traces with respect to their height and average reversal length. Various works indicate that the reversal length can be an important aspect in genome rearrangements. The algorithms RA and SWA show a tendency to lose traces with high average reversal length. Such traces are however rare, and qualitatively our results show that, for testable-sized permutations, the algorithms DFALT and SWA produce distributions which approximate the reversal length distributions observed with a complete enumeration of the set of traces.
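To see why a single optimal solution can hide many alternatives, the following Python sketch counts every optimal sorting sequence of a small signed permutation by brute-force breadth-first search. It is exponential and only usable on tiny inputs; it is not one of the RA, DFALT or SWA algorithms, it does not build traces, and the function names are assumptions.

```python
from collections import deque


def signed_reversals(perm):
    """Yield every permutation reachable from perm by one signed reversal."""
    n = len(perm)
    for i in range(n):
        for j in range(i, n):
            # Reverse the segment perm[i..j] and flip the sign of each element.
            middle = tuple(-x for x in reversed(perm[i:j + 1]))
            yield perm[:i] + middle + perm[j + 1:]


def count_optimal_sortings(perm):
    """Return (reversal distance, number of optimal sorting sequences)."""
    start = tuple(perm)
    target = tuple(range(1, len(start) + 1))
    dist = {start: 0}
    ways = {start: 1}  # number of shortest reversal sequences reaching each state
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current == target:
            break  # all predecessors were already processed, so the count is final
        for nxt in signed_reversals(current):
            if nxt not in dist:
                dist[nxt] = dist[current] + 1
                ways[nxt] = ways[current]
                queue.append(nxt)
            elif dist[nxt] == dist[current] + 1:
                ways[nxt] += ways[current]
    return dist[target], ways[target]


print(count_optimal_sortings((2, 1)))  # (3, 6)
```

Even the two-element permutation (+2, +1) has six distinct optimal sequences of three reversals, which hints at how quickly the solution space grows and why compact representations and sampling become necessary.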

Genetics and Molecular Research | 2006