Pierre Peterlongo
University of Marne-la-Vallée
Publications
Featured research published by Pierre Peterlongo.
Algorithms for Molecular Biology | 2009
Pierre Peterlongo; Gustavo Sacomoto; Alair Pereira do Lago; Nadia Pisanti; Marie-France Sagot
Background: Identifying local similarity between two or more sequences, or identifying repeats occurring at least twice in a sequence, is an essential part of the analysis of biological sequences and of their phylogenetic relationships. Finding such fragments while allowing for a certain number of insertions, deletions, and substitutions is, however, known to be computationally expensive, and consequently exact methods usually cannot be applied in practice.
Results: The filter TUIUIU introduced in this paper provides a possible solution to this problem. It can be used as a preprocessing step for any multiple alignment or repeat inference method, eliminating a possibly large fraction of the input that is guaranteed not to contain any approximate repeat. It consists of verifying several strong necessary conditions that can be checked quickly. We implemented three versions of the filter. The first is a straightforward extension, to the case of multiple sequences, of conditions already existing in the literature. The second uses a stronger condition which, as our results show, filters noticeably more of the input with negligible (if any) additional time. The third version uses an additional condition and pushes the sensitivity of the filter even further, at a non-negligible additional time cost in many circumstances; our experiments show that it is particularly useful with large error rates. This third version was applied as a preprocessing step for a multiple alignment tool, giving an overall time (filter plus alignment) that was on average 63 times, and at best 530 times, shorter than direct alignment, with a better-quality alignment in most cases.
Conclusion: To the best of our knowledge, TUIUIU is the first filter designed for multiple repeats and for dealing with error rates greater than 10% of the repeat length.
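To illustrate what a "necessary condition that can be checked quickly" looks like, here is a minimal Python sketch of the classic q-gram lemma: two strings within `errors` edit operations of each other must share at least `max(len(a), len(b)) + 1 - q * (errors + 1)` q-grams. This is a well-known condition from the filtering literature, used here only as an example; the abstract does not specify TUIUIU's conditions, and the function names below are hypothetical.

```python
from collections import Counter


def qgram_threshold(length: int, q: int, errors: int) -> int:
    """Minimum number of shared q-grams implied by the q-gram lemma."""
    return length + 1 - q * (errors + 1)


def shared_qgrams(a: str, b: str, q: int) -> int:
    """Count q-grams common to a and b, with multiplicity."""
    ca = Counter(a[i:i + q] for i in range(len(a) - q + 1))
    cb = Counter(b[i:i + q] for i in range(len(b) - q + 1))
    return sum((ca & cb).values())  # multiset intersection


def may_be_similar(a: str, b: str, q: int, errors: int) -> bool:
    """Return False only when a and b cannot be within `errors` edits.

    A True result is not a proof of similarity: the pair simply
    survives the filter and still has to be verified by an aligner.
    """
    threshold = qgram_threshold(max(len(a), len(b)), q, errors)
    return shared_qgrams(a, b, q) >= threshold


# One substitution apart: the pair is kept for further checking.
kept = may_be_similar("ACGTACGTAC", "ACGTTCGTAC", q=3, errors=1)
# No q-gram in common: the pair is safely discarded.
discarded = not may_be_similar("ACGTACGTAC", "TTTTTTTTTT", q=3, errors=1)
```

A filter of this kind is lossless: it may keep pairs that turn out not to be similar, but it never discards a pair that satisfies the error bound.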
String Processing and Information Retrieval | 2005
Pierre Peterlongo; Nadia Pisanti; Frédéric Boyer; Marie-France Sagot
Similarity search in texts, notably biological sequences, has received substantial attention in the last few years. Numerous filtration and indexing techniques have been created to speed up the resolution of the problem. However, previous filters were designed to speed up pattern matching, or to find repetitions between two sequences or occurring twice in the same sequence. In this paper, we present an algorithm called NIMBUS for filtering sequences prior to finding repetitions occurring more than twice in a sequence or in more than two sequences. NIMBUS uses gapped seeds that are indexed with a new data structure, called a bi-factor array, that is also presented in this paper. Experimental results show that the filter can be very efficient: when a data set in which one wants to find functional elements with a multiple local alignment tool such as GLAM [7] is first preprocessed with NIMBUS, the overall execution time can be reduced from 10 hours to 6 minutes while obtaining exactly the same results.
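The idea of indexing gapped seeds can be sketched with a toy Python example. Here a bi-factor is assumed to be a pair of length-`k` substrings separated by a gap of `gap` characters, and the index is a plain dictionary. This is not the bi-factor array of the paper, whose layout the abstract does not describe; the function names and parameters are hypothetical.

```python
from collections import defaultdict


def index_bifactors(sequences, k, gap):
    """Map each (left, right) pair of k-mers to its occurrences.

    An occurrence is recorded as (sequence index, start position).
    The `gap` characters between the two k-mers are ignored, which
    is what lets a seed match across mismatching positions.
    """
    index = defaultdict(list)
    span = 2 * k + gap
    for seq_id, seq in enumerate(sequences):
        for pos in range(len(seq) - span + 1):
            left = seq[pos:pos + k]
            right = seq[pos + k + gap:pos + span]
            index[(left, right)].append((seq_id, pos))
    return index


def shared_bifactors(index, min_sequences):
    """Keep the bi-factors that occur in at least `min_sequences` sequences."""
    return {
        key: occurrences
        for key, occurrences in index.items()
        if len({seq_id for seq_id, _ in occurrences}) >= min_sequences
    }


seqs = ["ACGTTTGCA", "ACGAAAGCA", "TTTTTTTTT"]
shared = shared_bifactors(index_bifactors(seqs, k=3, gap=3), min_sequences=2)
# The first two sequences differ only inside the gap, so they share
# the bi-factor ("ACG", "GCA"); the third sequence shares nothing.
```

Regions that carry no bi-factor shared by enough sequences are candidates for removal before the expensive alignment step.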
International Conference on Implementation and Application of Automata | 2006
Pavlos Antoniou; Jan Holub; Costas S. Iliopoulos; Bořivoj Melichar; Pierre Peterlongo
We present an algorithm that uses finite automata to find the common motifs with gaps occurring in all strings belonging to a finite set S = {S1,S2,...,Sr}. To find these common motifs, we must first identify the factors that exist in each string, so the algorithm begins by constructing a factor automaton for each string Si. To find the factors common to all the strings, the algorithm needs to gather the factors of every string in one data structure; this is achieved by computing an automaton that accepts the union of the above-mentioned automata. Using this automaton we are able to create a new factor alphabet. Based on this factor alphabet, a finite automaton is created for each string Si that accepts sequences of all non-overlapping factors residing in that string. The intersection of the latter automata produces the finite automaton that accepts all the common subsequences with gaps over the factor alphabet that are present in all the strings of the set S. These common subsequences are the common motifs of the strings.
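The pipeline above can be approximated in a few lines of Python. Note the substitution: this sketch uses plain sets of substrings in place of factor automata, so it illustrates what is computed (common factors, then gapped motifs over them) rather than the automata-based method of the paper. All function names are hypothetical, and enumerating every substring is quadratic in the string length, so it is only suitable for small inputs.

```python
def factors(s, min_len=1):
    """Return the set of all factors (substrings) of s of length >= min_len."""
    return {s[i:j] for i in range(len(s)) for j in range(i + min_len, len(s) + 1)}


def common_factors(strings, min_len=1):
    """Factors present in every string: the candidate 'factor alphabet'."""
    return set.intersection(*(factors(s, min_len) for s in strings))


def occurs_with_gaps(motif, s):
    """Check that the factors of `motif` occur in s, in order, without overlap.

    Taking the leftmost possible match for each factor is sufficient:
    an earlier match never rules out a later one.
    """
    pos = 0
    for factor in motif:
        hit = s.find(factor, pos)
        if hit < 0:
            return False
        pos = hit + len(factor)  # the next factor must start after this one
    return True


def is_common_motif(motif, strings):
    """A gapped motif is common if it occurs in every string."""
    return all(occurs_with_gaps(motif, s) for s in strings)


strings = ["abcxxde", "abcyde", "zabcdez"]
alphabet = common_factors(strings, min_len=2)
ok = is_common_motif(["abc", "de"], strings)        # "abc", a gap, then "de"
reversed_ok = is_common_motif(["de", "abc"], strings)  # wrong order
```

In this example the gap between "abc" and "de" has length 2, 1, and 0 in the three strings respectively, which is the kind of variable-gap motif the abstract describes.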
Language and Automata Theory and Applications | 2007
Pavlos Antoniou; Maxime Crochemore; Costas S. Iliopoulos; Pierre Peterlongo
Prague Stringology Conference | 2006
Pierre Peterlongo; Julien Allali; Marie-France Sagot
Archive | 2011
Pierre Peterlongo; Rayan Chikhi
International Conference on Holobionts | 2017
Arnaud Meng; Erwan Corre; Pierre Peterlongo; Camille Marchet; Adriana Alberti; Corinne Da Silva; Patrick Wincker; Ian Probert; Noritoshi Suzuki; Stéphane Le Crom; Lucie Bittner; Fabrice Not
Archive | 2016
Pierre Peterlongo; Antoine Limasset
Archive | 2016
Sebastien Letort; Pierre Peterlongo; Dominique Lavenier; Claire Lemaitre; Fabrice Legeai; Patrick Durand; Charles Deltel
JOBIM 2016 | 2016
Antoine Limasset; Camille Marchet; Pierre Peterlongo; Lucie Bittner