Publication


Featured research published by Laurent Mouchard.


Proceedings of the National Academy of Sciences of the United States of America | 2004

Whole-genome shotgun assembly and comparison of human genome assemblies

Sorin Istrail; Granger Sutton; Liliana Florea; Aaron L. Halpern; Clark M. Mobarry; Ross A. Lippert; Brian Walenz; Hagit Shatkay; Ian M. Dew; Jason R. Miller; Michael Flanigan; Nathan Edwards; Randall Bolanos; Daniel Fasulo; Bjarni V. Halldórsson; Sridhar Hannenhalli; Russell Turner; Shibu Yooseph; Fu Lu; Deborah Nusskern; Bixiong Shue; Xiangqun Holly Zheng; Fei Zhong; Arthur L. Delcher; Daniel H. Huson; Saul Kravitz; Laurent Mouchard; Knut Reinert; Karin A. Remington; Andrew G. Clark

We report a whole-genome shotgun assembly (called WGSA) of the human genome generated at Celera in 2001. The Celera-generated shotgun data set consisted of 27 million sequencing reads organized in pairs by virtue of end-sequencing 2-kbp, 10-kbp, and 50-kbp inserts from shotgun clone libraries. The quality-trimmed reads covered the genome 5.3 times, and the inserts from which pairs of reads were obtained covered the genome 39 times. With the nearly complete human DNA sequence [National Center for Biotechnology Information (NCBI) Build 34] now available, it is possible to directly assess the quality, accuracy, and completeness of WGSA and of the first reconstructions of the human genome reported in two landmark papers in February 2001 [Venter, J. C., Adams, M. D., Myers, E. W., Li, P. W., Mural, R. J., Sutton, G. G., Smith, H. O., Yandell, M., Evans, C. A., Holt, R. A., et al. (2001) Science 291, 1304–1351; International Human Genome Sequencing Consortium (2001) Nature 409, 860–921]. The analysis of WGSA shows 97% order and orientation agreement with NCBI Build 34, where most of the 3% of sequence out of order is due to scaffold placement problems as opposed to assembly errors within the scaffolds themselves. In addition, WGSA fills some of the remaining gaps in NCBI Build 34. The early genome sequences all covered about the same amount of the genome, but they did so in different ways. The Celera results provide more order and orientation, and the consortium sequence provides better coverage of exact and nearly exact repeats.
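
The coverage figures quoted above follow from simple arithmetic: coverage is the total number of sequenced (or, for clone coverage, spanned) bases divided by the genome length. The sketch below illustrates that calculation; the read length, insert length and genome size used are assumed round numbers for illustration, not values taken from the paper.

```python
# Coverage arithmetic for a whole-genome shotgun project (illustrative sketch).
# The mean read/insert lengths and the genome size below are assumptions,
# not figures reported in the paper.

def coverage(n_fragments: float, mean_fragment_len: float, genome_len: float) -> float:
    """Total sequenced (or spanned) bases divided by genome length."""
    return n_fragments * mean_fragment_len / genome_len

GENOME_LEN = 2.9e9          # assumed haploid human genome size, in bp
N_READS = 27e6              # ~27 million reads (from the abstract)
MEAN_READ_LEN = 570         # assumed mean quality-trimmed read length, in bp

N_PAIRS = N_READS / 2       # each insert is end-sequenced, yielding one read pair
MEAN_INSERT_LEN = 8_000     # assumed average over the 2/10/50-kbp libraries, in bp

print(f"sequence coverage ~ {coverage(N_READS, MEAN_READ_LEN, GENOME_LEN):.1f}x")
print(f"clone coverage    ~ {coverage(N_PAIRS, MEAN_INSERT_LEN, GENOME_LEN):.1f}x")
```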


International Journal of Computer Mathematics | 2002

Algorithms For Computing Approximate Repetitions In Musical Sequences

Emilios Cambouropoulos; Maxime Crochemore; Costas S. Iliopoulos; Laurent Mouchard; Yoan J. Pinzón

Here we introduce two new notions of approximate matching with applications in computer-assisted music analysis. We present algorithms for each notion of approximation: one for approximate string matching and one for computing approximate squares.
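
One notion of approximation commonly used in computer-assisted music analysis is δ-matching, where two integer (e.g., pitch) sequences match if corresponding values never differ by more than δ. The sketch below is a naive illustration of that idea only, not the algorithms proposed in the paper; the function name and parameters are assumptions.

```python
# Naive delta-approximate matching over integer (e.g., MIDI pitch) sequences.
# Illustrates one common notion of approximation in music analysis; it is not
# the algorithms presented in the paper.

def delta_match_positions(text: list[int], pattern: list[int], delta: int) -> list[int]:
    """Return all positions i such that |text[i+j] - pattern[j]| <= delta for every j."""
    m, n = len(pattern), len(text)
    hits = []
    for i in range(n - m + 1):
        if all(abs(text[i + j] - pattern[j]) <= delta for j in range(m)):
            hits.append(i)
    return hits

# An approximate occurrence of a short motif in a melody (made-up pitch values).
melody = [60, 62, 64, 65, 67, 69, 67, 65]
motif = [64, 66, 67]   # differs from melody[2:5] by 1 at one position
print(delta_match_positions(melody, motif, delta=1))   # -> [2]
```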


Advances in Experimental Medicine and Biology | 2010

A fast and efficient algorithm for mapping short sequences to a reference genome.

Pavlos Antoniou; Costas S. Iliopoulos; Laurent Mouchard; Solon P. Pissis

Novel high-throughput (deep) sequencing technologies have redefined the way genome sequencing is performed. They are able to produce tens of millions of short sequences (reads) in a single experiment, at a much lower cost than previous sequencing methods. In this paper, we present a new algorithm for efficiently mapping millions of short reads to a reference genome. In particular, we define and solve the Massive Approximate Pattern Matching problem for mapping short sequences to a reference genome.
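
To make the mapping problem concrete, the sketch below finds all positions where a read occurs in a reference with at most k mismatches by brute force. It only illustrates the matching criterion; real tools (including the algorithm described here) index the reference rather than scanning it, and all names and parameters below are assumptions.

```python
# Illustrative brute-force read mapping with at most k mismatches.
# Shows the matching criterion only; it is not the algorithm of the paper.

def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))

def map_read(reference: str, read: str, k: int) -> list[int]:
    """All positions where `read` occurs in `reference` with at most k mismatches."""
    m = len(read)
    return [i for i in range(len(reference) - m + 1)
            if hamming(reference[i:i + m], read) <= k]

reference = "ACGTACGTTAGCACGT"
print(map_read(reference, "ACGA", k=1))   # -> [0, 4, 12]
```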


International Conference on Bioinformatics | 2010

REAL: an efficient REad ALigner for next generation sequencing reads

Kimon Frousios; Costas S. Iliopoulos; Laurent Mouchard; Solon P. Pissis; German Tischler

Motivation: The constant advances in sequencing technology are turning whole-genome sequencing into a routine procedure, resulting in massive amounts of data that need to be processed. Tens of gigabytes of data in the form of short reads need to be mapped back to reference sequences that are a few gigabases long. A first generation of short-read alignment software successfully employed hash tables, and the current second generation uses the Burrows-Wheeler Transform, further improving mapping speed. However, there is still demand for faster and more accurate mapping. Results: In this paper, we present REad ALigner, an efficient, accurate and consistent tool for aligning short reads obtained from next generation sequencing. It is based on a new, simple, yet efficient mapping algorithm that can match and outperform current BWT-based software.
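
The abstract contrasts hash-table indexing with BWT-based indexing. As a hedged illustration of the hash-table idea only (not REAL's actual algorithm), the sketch below indexes fixed-length seeds of the reference and uses them to retrieve candidate alignment positions for a read; the seed length and all names are assumptions.

```python
# Minimal seed lookup using a hash table of reference k-mers. Purely
# illustrative of hash-table indexing for read alignment; it is not the
# mapping algorithm implemented in REAL.

from collections import defaultdict

def build_index(reference: str, seed_len: int) -> dict[str, list[int]]:
    """Hash table mapping each k-mer (seed) of the reference to its positions."""
    index = defaultdict(list)
    for i in range(len(reference) - seed_len + 1):
        index[reference[i:i + seed_len]].append(i)
    return index

def candidate_positions(index: dict[str, list[int]], read: str, seed_len: int) -> set[int]:
    """Candidate start positions implied by exact matches of the read's first seed."""
    return set(index.get(read[:seed_len], []))

reference = "ACGTACGTTAGCACGT"
index = build_index(reference, seed_len=4)
print(sorted(candidate_positions(index, "ACGTTAGC", seed_len=4)))   # -> [0, 4, 12]
```

Each candidate would then be verified (and possibly extended) against the full read, which is where the accuracy/speed trade-offs between tools arise.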


Journal of Discrete Algorithms | 2010

Dynamic extended suffix arrays

Mikaël Salson; Thierry Lecroq; Martine Léonard; Laurent Mouchard

The suffix tree data structure was intensively described, studied and used in the eighties and nineties, its linear-time construction counterbalancing its space-consuming requirements. An equivalent data structure, the suffix array, was described by Manber and Myers in 1990. This space-economical structure was neglected for more than a decade, its construction being too slow. Since 2003, several linear-time suffix array construction algorithms have been proposed, and this structure has slowly replaced the suffix tree in many string processing problems. All these constructions build the suffix array from the text, and any edit operation on the text leads to the construction of a brand new suffix array. In this article, we present an algorithm that modifies the suffix array and the Longest Common Prefix (LCP) array when the text is edited (insertion, substitution or deletion of a letter or a factor). This algorithm is based on a recent four-stage algorithm developed for the dynamic Burrows-Wheeler Transform (BWT). To minimize the space complexity, we sample the suffix array, a technique used in BWT-based compressed indexes. We furthermore explain how this technique can be adapted to maintain a sample of the extended suffix array, containing a sample of the suffix array, a sample of the inverse suffix array and the whole LCP array. Our practical experiments show that the algorithm performs very well in practice, being quicker than the fastest suffix array construction algorithm.
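
For readers unfamiliar with the structures involved, the sketch below builds a suffix array and its LCP array naively. It is neither a linear-time construction nor the dynamic update algorithm the paper proposes; it only shows what the (static) arrays contain.

```python
# Naive construction of a suffix array and LCP array, for illustration only.
# The paper's contribution is *updating* these structures under text edits;
# this sketch merely shows what the static structures are.

def suffix_array(text: str) -> list[int]:
    """Indices of all suffixes of `text`, sorted lexicographically."""
    return sorted(range(len(text)), key=lambda i: text[i:])

def lcp_array(text: str, sa: list[int]) -> list[int]:
    """LCP[k] = length of the longest common prefix of suffixes sa[k-1] and sa[k]."""
    def lcp(i: int, j: int) -> int:
        n = 0
        while i + n < len(text) and j + n < len(text) and text[i + n] == text[j + n]:
            n += 1
        return n
    return [0] + [lcp(sa[k - 1], sa[k]) for k in range(1, len(sa))]

text = "banana"
sa = suffix_array(text)          # [5, 3, 1, 0, 4, 2]
print(sa, lcp_array(text, sa))   # LCP: [0, 1, 3, 0, 0, 2]
```

After an edit to the text, a naive approach rebuilds both arrays from scratch; the paper's algorithm instead repairs them in place, keeping only a sample of the suffix array to bound the space.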


Journal of Computational Biology | 2006

Computation of repetitions and regularities of biologically weighted sequences.

Manolis Christodoulakis; Costas S. Iliopoulos; Laurent Mouchard; Katerina Perdikuri; Athanasios K. Tsakalidis; Kostas Tsichlas

Biological weighted sequences are used extensively in molecular biology as profiles for protein families, in the representation of binding sites and often for the representation of sequences produced by a shotgun sequencing strategy. In this paper, we address three fundamental problems in the area of biologically weighted sequences: (i) computation of repetitions, (ii) pattern matching, and (iii) computation of regularities. Our algorithms can be used as basic building blocks for more sophisticated algorithms applied on weighted sequences.
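
A weighted sequence assigns each position a probability for every character. A common convention, assumed here for illustration since the abstract does not spell it out, is to report an occurrence of a plain pattern when the product of the per-position probabilities meets a threshold; the naive sketch below shows that criterion and is not one of the paper's algorithms.

```python
# Illustrative occurrence test for a plain pattern in a weighted DNA sequence.
# Each position of the weighted sequence is a dict of character probabilities.
# The product-of-probabilities threshold is an assumed convention; this is not
# the paper's algorithm.

from math import prod

WeightedSeq = list[dict[str, float]]

def occurrences(wseq: WeightedSeq, pattern: str, threshold: float) -> list[int]:
    """Positions where the pattern occurs with probability >= threshold."""
    m, hits = len(pattern), []
    for i in range(len(wseq) - m + 1):
        p = prod(wseq[i + j].get(pattern[j], 0.0) for j in range(m))
        if p >= threshold:
            hits.append(i)
    return hits

wseq = [{"A": 1.0},
        {"C": 0.5, "G": 0.5},
        {"T": 1.0},
        {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}]
print(occurrences(wseq, "ACT", threshold=0.4))   # -> [0]  (probability 0.5)
```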


Mathematics in Computer Science | 2008

A New Approach to Pattern Matching in Degenerate DNA/RNA Sequences and Distributed Pattern Matching

Costas S. Iliopoulos; Laurent Mouchard; M. Sohel Rahman

In this paper, we consider the pattern matching problem in DNA and RNA sequences where either the pattern or the text can be degenerate, i.e., contain sets of characters. We present an asymptotically faster algorithm for this problem that works in O(n log m) time, where n and m are the lengths of the text and the pattern, respectively. We also suggest an efficient implementation of our algorithm, which works in linear time when the pattern size is small. Finally, we describe how our approach can be used to solve the distributed pattern matching problem.
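
A degenerate position is a set of allowed characters (e.g., an IUPAC code). One simple way to test a match at a given alignment, shown below as a hedged sketch rather than the paper's O(n log m) algorithm, is to encode each set as a bitmask over the alphabet and require a non-empty intersection at every position.

```python
# Illustrative matching of a degenerate pattern against a plain DNA text.
# Each degenerate position is a bitmask over {A, C, G, T}; a position matches
# when the text character's bit is in the pattern's mask. This is a naive
# illustration, not the algorithm of the paper.

BITS = {"A": 1, "C": 2, "G": 4, "T": 8}

def mask(chars: str) -> int:
    """Bitmask of a set of allowed characters, e.g. 'AG' for the IUPAC code R."""
    m = 0
    for c in chars:
        m |= BITS[c]
    return m

def degenerate_match_positions(text: str, pattern: list[int]) -> list[int]:
    """Positions where every pattern mask intersects the corresponding text character."""
    m = len(pattern)
    return [i for i in range(len(text) - m + 1)
            if all(BITS[text[i + j]] & pattern[j] for j in range(m))]

# Degenerate pattern A-R-T, where R = {A, G} (IUPAC).
pattern = [mask("A"), mask("AG"), mask("T")]
print(degenerate_match_positions("ACGTAATAGT", pattern))   # -> [4, 7]
```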


IEEE International Conference on Information Technology and Applications in Biomedicine | 2009

Mapping uniquely occurring short sequences derived from high throughput technologies to a reference genome

Pavlos Antoniou; Jackie W. Daykin; Costas S. Iliopoulos; Derrick G. Kourie; Laurent Mouchard; Solon P. Pissis

Novel high-throughput sequencing technologies have redefined the way genome sequencing is performed. They are able to produce tens of millions of short sequences (reads) in a single experiment, at a much lower cost than previous sequencing methods. Due to the massive amount of data generated by these systems, efficient algorithms for mapping short sequences to a reference genome are in great demand. In this paper, we present a practical algorithm for efficiently mapping uniquely occurring short reads to a reference genome, which requires classifying the short reads into unique and duplicate matches. In particular, we define and solve the Massive Exact Unique Pattern Matching problem in genomes.
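
The unique/duplicate classification can be illustrated naively: record every position at which a read occurs exactly in the reference, then call the read unique if it has exactly one occurrence. The sketch below (names and structure are mine, not the paper's) shows only that criterion.

```python
# Illustrative classification of exact read matches as unique or duplicate.
# Shows the unique/duplicate criterion only; it is not the Massive Exact
# Unique Pattern Matching algorithm of the paper.

def exact_positions(reference: str, read: str) -> list[int]:
    """All positions where `read` occurs exactly in `reference`."""
    m = len(read)
    return [i for i in range(len(reference) - m + 1) if reference[i:i + m] == read]

def classify(reference: str, reads: list[str]) -> dict[str, str]:
    """Label each read 'unique', 'duplicate', or 'unmapped' by its exact occurrences."""
    labels = {}
    for read in reads:
        hits = exact_positions(reference, read)
        labels[read] = "unmapped" if not hits else ("unique" if len(hits) == 1 else "duplicate")
    return labels

reference = "ACGTACGTTAGC"
print(classify(reference, ["ACGT", "TAGC", "GGGG"]))
# -> {'ACGT': 'duplicate', 'TAGC': 'unique', 'GGGG': 'unmapped'}
```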


Medical Image Computing and Computer-Assisted Intervention | 2012

Segmentation of biological target volumes on multi-tracer PET images based on information fusion for achieving dose painting in radiotherapy

Benoît Lelandais; Isabelle Gardin; Laurent Mouchard; Pierre Vera; Su Ruan

Medical imaging plays an important role in radiotherapy. Dose painting consists in applying a nonuniform dose prescription to a tumoral region, and relies on an efficient segmentation of biological target volumes (BTV). These are derived from PET images, which highlight tumoral regions of enhanced glucose metabolism (FDG), cell proliferation (FLT) and hypoxia (FMiso). In this paper, a framework based on Belief Function Theory is proposed for BTV segmentation and for creating 3D parametric images for dose painting. We propose to take advantage of neighboring voxels for BTV segmentation, and to fuse information from multi-tracer PET images to create parametric images. The performance of BTV segmentation was evaluated on an anthropomorphic phantom and compared with two other methods. Quantitative results show the good performance of our method. It has been applied to data from five patients suffering from lung cancer. The parametric images show promising results by highlighting areas where a high frequency or dose escalation could be planned.
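
Belief Function (Dempster-Shafer) theory fuses evidence from several sources by combining mass functions. As a generic, hedged illustration of that fusion step only (not the segmentation method of the paper), the sketch below applies Dempster's rule of combination to two made-up mass functions over a two-class frame.

```python
# Generic Dempster's rule of combination for two mass functions, as an
# illustration of the evidence-fusion step in Belief Function Theory.
# The frame, class names and mass values are made up; this is not the
# paper's segmentation model.

from itertools import product

def combine(m1: dict[frozenset, float], m2: dict[frozenset, float]) -> dict[frozenset, float]:
    """Dempster's rule: conjunctive combination followed by normalisation."""
    combined: dict[frozenset, float] = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb          # mass assigned to the empty set
    return {s: w / (1.0 - conflict) for s, w in combined.items()}

T, B = frozenset({"tumour"}), frozenset({"background"})
TB = T | B                              # total ignorance
m_tracer1 = {T: 0.6, B: 0.1, TB: 0.3}   # made-up evidence from one tracer
m_tracer2 = {T: 0.5, B: 0.2, TB: 0.3}   # made-up evidence from another tracer
print(combine(m_tracer1, m_tracer2))
```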


Lecture Notes in Computer Science | 2001

The Max-Shift Algorithm for Approximate String Matching

Costas S. Iliopoulos; Laurent Mouchard; Yoan J. Pinzón

The approximate string matching problem is to find all locations at which a pattern of length m matches a substring of a text of length n with at most k differences. The program agrep implements a simple and practical bit-vector algorithm for this problem. In this paper we consider the following incremental version of the problem: given an appropriate encoding of a comparison between A and B, can one compute the answer for A and bB, and the answer for A and Bc, with equal efficiency, where b and c are additional symbols? Here we present an elegant and very easy to implement bit-vector algorithm for answering these questions that requires only O(n⌈m/w⌉) time, where n is the length of A, m is the length of B and w is the number of bits in a machine word. We also present an O(nm⌈h/w⌉) algorithm for the fixed-length approximate string matching problem: given a text t, a pattern p and an integer h, compute the optimal alignment of all substrings of p of length h and a substring of t.
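
To make the underlying problem concrete, the sketch below solves approximate string matching with at most k differences using the textbook column-by-column dynamic programming recurrence (row 0 kept at zero so a match may start anywhere in the text). It is a plain DP illustration, not the Max-Shift bit-vector algorithm.

```python
# Textbook dynamic programming for approximate string matching with at most k
# differences (insertions, deletions, substitutions). Shown only to make the
# problem concrete; the paper's contribution is a bit-vector algorithm, which
# this sketch does not implement.

def k_difference_matches(text: str, pattern: str, k: int) -> list[int]:
    """End positions in `text` where `pattern` matches with at most k differences."""
    m = len(pattern)
    prev = list(range(m + 1))          # column for the empty text prefix
    ends = []
    for j, c in enumerate(text, start=1):
        curr = [0] * (m + 1)           # row 0 stays 0: a match may start anywhere
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == c else 1
            curr[i] = min(prev[i - 1] + cost,   # match / substitution
                          prev[i] + 1,          # insertion
                          curr[i - 1] + 1)      # deletion
        if curr[m] <= k:
            ends.append(j)
        prev = curr
    return ends

print(k_difference_matches("surgery", "survey", k=2))   # -> [5, 6, 7]
```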
