Mikaël Salson | Researchain

Publication

Featured research published by Mikaël Salson.

Genome Biology | 2013

CRAC: an integrated approach to the analysis of RNA-seq reads

Nicolas Philippe; Mikaël Salson; Thérèse Commes; Eric Rivals

A large number of RNA-sequencing studies set out to predict mutations, splice junctions or fusion RNAs. We propose a method, CRAC, that integrates genomic locations and local coverage to enable such predictions to be made directly from RNA-seq read analysis. A k-mer profiling approach detects candidate mutations, indels and splice or chimeric junctions in each single read. CRAC increases precision compared with existing tools, reaching 99.5% for splice junctions, without losing sensitivity. Importantly, CRAC predictions improve with read length. In cancer libraries, CRAC recovered 74% of validated fusion RNAs and predicted novel recurrent chimeric junctions. CRAC is available at http://crac.gforge.inria.fr.

BMC Genomics | 2014

Fast multiclonal clusterization of V(D)J recombinations from high-throughput sequencing

Mathieu Giraud; Mikaël Salson; Marc Duez; Céline Villenet; Sabine Quief; Aurélie Caillault; Nathalie Grardel; Christophe Roumier; Claude Preudhomme; Martin Figeac

Background: V(D)J recombinations in lymphocytes are essential for immunological diversity. They are also useful markers of pathologies. In leukemia, they are used to quantify the minimal residual disease during patient follow-up. However, the full breadth of lymphocyte diversity is not fully understood. Results: We propose new algorithms that process high-throughput sequencing (HTS) data to extract unnamed V(D)J junctions and gather them into clones for quantification. This analysis is based on a seed heuristic and is fast and scalable because in the first phase, no alignment is performed with germline database sequences. The algorithms were applied to TR γ HTS data from a patient with acute lymphoblastic leukemia, and also on data simulating hypermutations. Our methods identified the main clone, as well as additional clones that were not identified with standard protocols. Conclusions: The proposed algorithms provide new insight into the analysis of high-throughput sequencing data for leukemia, and also to the quantitative assessment of any immunological profile. The methods described here are implemented in a C++ open-source program called Vidjil.

Blood | 2015

The predictive strength of next-generation sequencing MRD detection for relapse compared with current methods in childhood ALL

Michaela Kotrova; Katerina Muzikova; Ester Mejstrikova; Michaela Novakova; Violeta Bakardjieva-Mihaylova; Karel Fiser; Jan Stuchly; Mathieu Giraud; Mikaël Salson; Christiane Pott; Monika Brüggemann; Marc Füllgrabe; Jan Stary; Jan Trka; Eva Fronkova

To the editor: Minimal residual disease (MRD) monitoring via antigen receptor quantitative polymerase chain reaction (qPCR) is an important predictor of outcome in childhood acute lymphoblastic leukemia (ALL), is rigorously standardized within the EuroMRD consortium and has a greater sensitivity

Journal of Discrete Algorithms | 2010

Dynamic extended suffix arrays

Mikaël Salson; Thierry Lecroq; Martine Léonard; Laurent Mouchard

The suffix tree data structure was intensively described, studied and used in the eighties and nineties, its linear-time construction counterbalancing its space-consuming requirements. An equivalent data structure, the suffix array, was described by Manber and Myers in 1990. This space-economical structure was neglected for more than a decade, its construction being too slow. Since 2003, several linear-time suffix array construction algorithms have been proposed, and this structure has slowly replaced the suffix tree in many string processing problems. All these constructions build the suffix array from the text, and any edit operation on the text leads to the construction of a brand new suffix array. In this article, we present an algorithm that modifies the suffix array and the Longest Common Prefix (LCP) array when the text is edited (insertion, substitution or deletion of a letter or a factor). This algorithm is based on a recent four-stage algorithm developed for dynamic Burrows-Wheeler Transforms (BWT). To minimize the space complexity, we sample the Suffix Array, a technique used in BWT-based compressed indexes. We furthermore explain how this technique can be adapted for maintaining a sample of the Extended Suffix Array, containing a sample of the Suffix Array, a sample of the Inverse Suffix Array and the whole LCP array. Our practical experiments show that it operates very well in practice, being quicker than the fastest suffix array construction algorithm.
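For readers unfamiliar with the three arrays named in this abstract, the following is a minimal Python sketch of a static (non-dynamic) construction. It is not the algorithm of the paper: it uses naive sorting, and the function names and the `banana$` example are illustrative assumptions. It only shows what the suffix array, inverse suffix array and LCP array contain, and why a static approach has to rebuild everything after an edit.

```python
def suffix_array(text):
    """Naive construction: sort suffix start positions by the suffix they denote."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def inverse_suffix_array(sa):
    """isa[pos] = rank of the suffix starting at pos in the suffix array."""
    isa = [0] * len(sa)
    for rank, pos in enumerate(sa):
        isa[pos] = rank
    return isa


def lcp_array(text, sa):
    """lcp[i] = longest common prefix length of suffixes sa[i-1] and sa[i]; lcp[0] = 0."""
    lcp = [0] * len(sa)
    for i in range(1, len(sa)):
        a, b = sa[i - 1], sa[i]
        n = 0
        while a + n < len(text) and b + n < len(text) and text[a + n] == text[b + n]:
            n += 1
        lcp[i] = n
    return lcp


text = "banana$"  # '$' is a terminator that sorts before the letters
sa = suffix_array(text)
isa = inverse_suffix_array(sa)
lcp = lcp_array(text, sa)
print(sa)   # [6, 5, 3, 1, 0, 4, 2]
print(isa)  # [4, 3, 6, 2, 5, 1, 0]
print(lcp)  # [0, 0, 1, 3, 0, 0, 2]

# With a static construction, a single-letter insertion forces a full rebuild.
edited = text[:2] + "n" + text[2:]
sa_edited = suffix_array(edited)
```

The last two lines illustrate the cost that a dynamic structure aims to avoid: here the whole array is recomputed from the edited text, whereas the approach described in the abstract updates the existing arrays in place.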

PLOS ONE | 2016

Vidjil: A Web Platform for Analysis of High-Throughput Repertoire Sequencing

Marc Duez; Mathieu Giraud; Ryan Herbert; Tatiana Rocher; Mikaël Salson; Florian Thonier

Background: The B and T lymphocytes are white blood cells playing a key role in adaptive immunity. A part of their DNA, called the V(D)J recombinations, is specific to each lymphocyte, and enables recognition of specific antigens. Today, with new sequencing techniques, one can get billions of DNA sequences from these regions. With dedicated Repertoire Sequencing (RepSeq) methods, it is now possible to picture populations of lymphocytes, and to monitor more accurately the immune response as well as pathologies such as leukemia. Methods and Results: Vidjil is an open-source platform for the interactive analysis of high-throughput sequencing data from lymphocyte recombinations. It contains an algorithm gathering reads into clonotypes according to their V(D)J junctions, a web application made of a sample, experiment and patient database, and a visualization for the analysis of clonotypes over time. Vidjil is implemented in C++, Python and JavaScript and licensed under the GPLv3 open-source license. Source code, binaries and a public web server are available at http://www.vidjil.org and at http://bioinfo.lille.inria.fr/vidjil. Using the Vidjil web application consists of four steps: 1. uploading a raw sequence file (typically a FASTQ); 2. running RepSeq analysis software; 3. visualizing the results; 4. annotating the results and saving them for future use. For the end-user, the Vidjil web application needs no specific installation and just requires a connection and a modern web browser. Vidjil is used by labs in hematology or immunology for research and clinical applications.

British Journal of Haematology | 2016

Multi-loci diagnosis of acute lymphoblastic leukaemia with high-throughput sequencing and bioinformatics analysis.

Yann Ferret; Aurélie Caillault; Shéhérazade Sebda; Marc Duez; Nathalie Grardel; Nicolas Duployez; Céline Villenet; Martin Figeac; Claude Preudhomme; Mikaël Salson; Mathieu Giraud

High‐throughput sequencing (HTS) is considered a technical revolution that has improved our knowledge of lymphoid and autoimmune diseases, changing our approach to leukaemia both at diagnosis and during follow‐up. As part of an immunoglobulin/T cell receptor‐based minimal residual disease (MRD) assessment of acute lymphoblastic leukaemia patients, we assessed the performance and feasibility of the replacement of the first steps of the approach based on DNA isolation and Sanger sequencing, using a HTS protocol combined with bioinformatics analysis and visualization using the Vidjil software. We prospectively analysed the diagnostic and relapse samples of 34 paediatric patients, thus identifying 125 leukaemic clones with recombinations on multiple loci (TRG, TRD, IGH and IGK), including Dd2/Dd3 and Intron/KDE rearrangements. Sequencing failures were halved (14% vs. 34%, P = 0.0007), enabling more patients to be monitored. Furthermore, more markers per patient could be monitored, reducing the probability of false negative MRD results. The whole analysis, from sample receipt to clinical validation, was shorter than our current diagnostic protocol, with equal resources. V(D)J recombination was successfully assigned by the software, even for unusual recombinations. This study emphasizes the progress that HTS with adapted bioinformatics tools can bring to the diagnosis of leukaemia patients.

Leukemia Research | 2017

High-throughput sequencing in acute lymphoblastic leukemia: Follow-up of minimal residual disease and emergence of new clones

Mikaël Salson; Mathieu Giraud; Aurélie Caillault; Nathalie Grardel; Nicolas Duployez; Yann Ferret; Marc Duez; Ryan Herbert; Tatiana Rocher; Shéhérazade Sebda; Sabine Quief; Céline Villenet; Martin Figeac; Claude Preudhomme

Minimal residual disease (MRD) is known to be an independent prognostic factor in patients with acute lymphoblastic leukemia (ALL). High-throughput sequencing (HTS) is currently used in routine practice for the diagnosis and follow-up of patients with hematological neoplasms. In this retrospective study, we examined the role of immunoglobulin/T-cell receptor-based MRD in patients with ALL by HTS analysis of immunoglobulin H and/or T-cell receptor gamma chain loci in bone marrow samples from 11 patients with ALL, at diagnosis and during follow-up. We assessed the clinical feasibility of using combined HTS and bioinformatics analysis with interactive visualization using Vidjil software. We discuss the advantages and drawbacks of HTS for monitoring MRD. HTS gives a more complete insight of the leukemic population than conventional real-time quantitative PCR (qPCR), and allows identification of new emerging clones at each time point of the monitoring. Thus, HTS monitoring of Ig/TR based MRD is expected to improve the management of patients with ALL.

Genome Biology | 2017

DE-kupl: exhaustive capture of biological variation in RNA-seq data through k-mer decomposition

Jérôme Audoux; Nicolas Philippe; Rayan Chikhi; Mikaël Salson; Mélina Gallopin; Marc Gabriel; Jérémy Le Coz; Emilie Drouineau; Thérèse Commes; Daniel Gautheret

We introduce a k-mer-based computational protocol, DE-kupl, for capturing local RNA variation in a set of RNA-seq libraries, independently of a reference genome or transcriptome. DE-kupl extracts all k-mers with differential abundance directly from the raw data files. This enables the retrieval of virtually all variation present in an RNA-seq data set. This variation is subsequently assigned to biological events or entities such as differential long non-coding RNAs, splice and polyadenylation variants, introns, repeats, editing or mutation events, and exogenous RNA. Applying DE-kupl to human RNA-seq data sets identified multiple types of novel events, reproducibly across independent RNA-seq experiments.
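The idea of comparing k-mer abundances between two sets of libraries, without any reference genome, can be illustrated with a toy Python sketch. This is not the DE-kupl implementation: the function names, the pseudocount and the fold-change threshold are assumptions made for illustration, and the threshold merely stands in for a proper normalization and statistical test. Reverse-complement (canonical) k-mers are also ignored for brevity.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every length-k substring across a list of reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def differential_kmers(reads_a, reads_b, k, min_fold=2.0, pseudocount=1):
    """Return {kmer: (count_a, count_b)} for k-mers whose smoothed
    count ratio between the two conditions is at least min_fold.
    Hypothetical helper: a fold-change filter, not a statistical test."""
    ca, cb = count_kmers(reads_a, k), count_kmers(reads_b, k)
    selected = {}
    for kmer in ca.keys() | cb.keys():
        a = ca[kmer] + pseudocount  # Counter returns 0 for absent k-mers
        b = cb[kmer] + pseudocount
        if max(a, b) / min(a, b) >= min_fold:
            selected[kmer] = (ca[kmer], cb[kmer])
    return selected


condition_a = ["ACGTAC", "ACGTAC"]  # made-up reads
condition_b = ["ACGAAC"]
hits = differential_kmers(condition_a, condition_b, k=4, min_fold=3.0)
print(sorted(hits.items()))
# [('ACGT', (2, 0)), ('CGTA', (2, 0)), ('GTAC', (2, 0))]
```

In this toy input the single-base difference between the two conditions surfaces as a small group of overlapping k-mers, which hints at how local variation can be captured directly from raw reads.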

International Workshop on Combinatorial Algorithms | 2014

Lossless Seeds for Searching Short Patterns with High Error Rates

Christophe Vroland; Mikaël Salson; Hélène Touzet

We address the problem of approximate pattern matching using the Levenshtein distance. Given a text T and a pattern P, find all locations in T that differ by at most k errors from P. For that purpose, we propose a filtration algorithm that is based on a novel type of seeds, combining exact parts and parts with a fixed number of errors. Experimental tests show that the method is specifically well-suited for short patterns with a large number of errors.
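The problem statement above (find all locations in T within k errors of P under the Levenshtein distance) can be made concrete with the classical dynamic-programming baseline, often attributed to Sellers. The Python sketch below is that baseline, not the seed-based filtration algorithm proposed in the paper; the function name and the example strings are illustrative assumptions. It reports the end positions of approximate occurrences.

```python
def approximate_matches(text, pattern, k):
    """Return end positions j (1-based, i.e. text[:j] ends the match) such that
    some substring of text ending at j is within Levenshtein distance k of pattern."""
    m = len(pattern)
    # prev[i] = best distance between pattern[:i] and a substring ending at the
    # previous text position; before reading any text it is simply i.
    prev = list(range(m + 1))
    ends = []
    for j, c in enumerate(text, start=1):
        curr = [0] * (m + 1)  # curr[0] = 0: an occurrence may start anywhere
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == c else 1
            curr[i] = min(
                prev[i - 1] + cost,  # match or substitution
                prev[i] + 1,         # extra character in the text
                curr[i - 1] + 1,     # missing character in the text
            )
        if curr[m] <= k:
            ends.append(j)
        prev = curr
    return ends


print(approximate_matches("abcdefg", "cxe", 1))  # [5]  ("cde" ends at position 5)
print(approximate_matches("abcdefg", "cde", 0))  # [5]
```

This baseline takes time proportional to |T| x |P| regardless of k. Filtration methods such as the one described in the abstract aim to avoid running this kind of verification over the whole text by first selecting candidate regions with seeds.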

Journal of Discrete Algorithms | 2016

Approximate search of short patterns with high error rates using the 01*0 lossless seeds

Christophe Vroland; Mikaël Salson; Sébastien Bini; Hélène Touzet

Approximate pattern matching is an important computational problem that has a wide range of applications in computational biology and in information retrieval. However, searching a short pattern in a text with high error rates (10-20%) under the Levenshtein distance is a task for which few efficient solutions exist. Here we address this problem by introducing a new type of seeds: the 01*0 seeds. These seeds are made of two exact parts separated by parts with exactly one error. We show that those seeds are lossless, and we apply them to two filtration algorithms for two popular applications, one where a compressed index is built on the text and another one where the patterns are indexed. We also demonstrate experimentally the advantages of our approach compared to alternative methods implementing other types of seeds. This work opens the way to the design of more efficient and more sensitive text algorithms.
